Delineate clusters from taxonomic identifications
Source: R/delineate_with_identification.R, delineate_with_identification.Rd
From the taxonomic identification report produced by the Bruker MALDI Biotyper, spectra sharing the same identification are assigned to the same cluster. Spectra with unknown identification (e.g., because the database is incomplete) are each placed in a cluster of their own.
Arguments
- tibble_report
A tibble of n rows, with n the number of spectra, produced by read_biotyper_report() or read_many_biotyper_reports(). The long format and the best hits options are expected to be used in these functions to produce a compliant input tibble.
Value
A tibble of n rows, one per spectrum, and 3 columns:
- name: the spectra names, taken from the name column of the output of either read_biotyper_report() or read_many_biotyper_reports().
- membership: integers giving the number of the cluster to which each spectrum belongs, ranging from 1 to c, the total number of clusters.
- cluster_size: integers giving the total number of spectra in the corresponding cluster.
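The relationship between the three columns can be illustrated with a minimal base R sketch. This is not the package's implementation: `sketch_delineate()` is a hypothetical helper, unidentified spectra are assumed to be marked with `NA`, and cluster numbers are assigned here in order of first appearance, which may differ from the numbering that delineate_with_identification() produces.

```r
# Hypothetical helper (not part of maldipickr): a base R sketch of the labeling rule.
# Assumption: an unidentified spectrum is marked with NA in `identification`.
sketch_delineate <- function(name, identification) {
  key <- identification
  is_unknown <- is.na(key)
  # Give each unidentified spectrum a key of its own so that it forms a
  # singleton cluster (assumes no real identification starts with this prefix)
  key[is_unknown] <- paste0("unidentified_", seq_len(sum(is_unknown)))
  # Number the clusters from 1 to c, in order of first appearance
  membership <- match(key, unique(key))
  # Look up, for each spectrum, the size of the cluster it belongs to
  cluster_size <- tabulate(membership)[membership]
  data.frame(name = name, membership = membership, cluster_size = cluster_size)
}

res <- sketch_delineate(
  name = c("s1", "s2", "s3", "s4"),
  identification = c(NA, NA, "Escherichia coli", "Escherichia coli")
)
res$membership    # 1 2 3 3
res$cluster_size  # 1 1 2 2
```

The two unidentified spectra end up in separate clusters of size 1, while the two spectra sharing an identification share one cluster of size 2.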
Details
Because every unknown identification is treated as its own cluster within one input tibble, it is important to consider, depending on the research question, whether the taxonomic identifications come from a single report or from multiple reports. A message is displayed to indicate which type of report the delineation was based on.
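To see why the choice matters, the following base R sketch counts clusters when two reports are delineated separately versus pooled into one input. The identifications are made-up data, `count_clusters()` is a hypothetical helper, and `NA` is assumed to mark an unidentified spectrum; none of this is the package's own code.

```r
# Made-up identifications from two reports; NA marks an unidentified
# spectrum (assumption for this sketch)
report_a <- c("Escherichia coli", NA, NA)
report_b <- c("Escherichia coli", NA)

# Hypothetical helper: one cluster per distinct identification,
# plus one cluster per unidentified spectrum
count_clusters <- function(identification) {
  known <- identification[!is.na(identification)]
  length(unique(known)) + sum(is.na(identification))
}

# Each report delineated on its own: 3 clusters + 2 clusters
separate <- count_clusters(report_a) + count_clusters(report_b)
# Both reports delineated together: the two E. coli spectra share a cluster
pooled <- count_clusters(c(report_a, report_b))

separate  # 5
pooled    # 4
```

Under these assumptions, identified spectra from different reports only share a cluster when the reports are pooled, whereas unidentified spectra stay in singleton clusters either way.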
Examples
report_unknown <- read_biotyper_report(
  system.file("biotyper_unknown.csv", package = "maldipickr")
)
delineate_with_identification(report_unknown)
#> Generating clusters from single report
#> # A tibble: 4 × 3
#>   name              membership cluster_size
#>   <chr>                  <int>        <int>
#> 1 unknown_isolate_1          2            1
#> 2 unknown_isolate_2          3            1
#> 3 unknown_isolate_3          1            2
#> 4 unknown_isolate_4          1            2