Quickstart
The maldipickr package helps microbiologists reduce duplicate/clonal bacteria from their cultures and eventually exclude previously selected bacteria. maldipickr achieve this feat by grouping together data from MALDI Biotyper and helps choose representative bacteria from each group using user-relevant metadata – a process known as cherry-picking.
maldipickr cherry-picks bacterial isolates with MALDI Biotyper:
Using taxonomic identification report
First make sure maldipickr is installed and loaded, alternatively follow the instructions to install the package.
Cherry-picking four isolates based on their taxonomic identification by the MALDI Biotyper is done in a few steps with maldipickr.
Get example data
We import an example Biotyper CSV report and glimpse at the table.
report_tbl <- read_biotyper_report(
system.file("biotyper_unknown.csv", package = "maldipickr")
)
report_tbl %>%
dplyr::select(name, bruker_species, bruker_log) %>% knitr::kable()
name | bruker_species | bruker_log |
---|---|---|
unknown_isolate_1 | not reliable identification | 1.33 |
unknown_isolate_2 | not reliable identification | 1.40 |
unknown_isolate_3 | Faecalibacterium prausnitzii | 1.96 |
unknown_isolate_4 | Faecalibacterium prausnitzii | 2.07 |
Delineate clusters and cherry-pick
Delineate clusters from the identifications after filtering the reliable ones and cherry-pick one representative spectra.
Unreliable identifications based on the log-score are replaced by “not reliable identification”, but stay tuned as they do not represent the same isolates!
report_tbl <- report_tbl %>%
dplyr::mutate(
bruker_species = dplyr::if_else(bruker_log >= 2, bruker_species,
"not reliable identification")
)
knitr::kable(report_tbl)
name | sample_name | hit_rank | bruker_quality | bruker_species | bruker_taxid | bruker_hash | bruker_log |
---|---|---|---|---|---|---|---|
unknown_isolate_1 | NA | 1 | - | not reliable identification | NA | 3e920566-2734-43dd-85d0-66cf23a2d6ef | 1.33 |
unknown_isolate_2 | NA | 1 | - | not reliable identification | NA | 88a85875-eeb5-4858-966e-98a077325dc3 | 1.40 |
unknown_isolate_3 | NA | 1 | + | not reliable identification | 137408536 | 2d266f20-5428-428d-96ec-ddd40200794b | 1.96 |
unknown_isolate_4 | NA | 1 | +++ | Faecalibacterium prausnitzii | 137408536 | 2d266f20-5428-428d-96ec-ddd40200794b | 2.07 |
The chosen ones are indicated by to_pick
column.
report_tbl %>%
delineate_with_identification() %>%
pick_spectra(report_tbl, criteria_column = "bruker_log") %>%
dplyr::relocate(name, to_pick, bruker_species) %>%
knitr::kable()
#> Generating clusters from single report
name | to_pick | bruker_species | membership | cluster_size | sample_name | hit_rank | bruker_quality | bruker_taxid | bruker_hash | bruker_log |
---|---|---|---|---|---|---|---|---|---|---|
unknown_isolate_1 | TRUE | not reliable identification | 2 | 1 | NA | 1 | - | NA | 3e920566-2734-43dd-85d0-66cf23a2d6ef | 1.33 |
unknown_isolate_2 | TRUE | not reliable identification | 3 | 1 | NA | 1 | - | NA | 88a85875-eeb5-4858-966e-98a077325dc3 | 1.40 |
unknown_isolate_3 | TRUE | not reliable identification | 4 | 1 | NA | 1 | + | 137408536 | 2d266f20-5428-428d-96ec-ddd40200794b | 1.96 |
unknown_isolate_4 | TRUE | Faecalibacterium prausnitzii | 1 | 1 | NA | 1 | +++ | 137408536 | 2d266f20-5428-428d-96ec-ddd40200794b | 2.07 |
Using spectra data
In parallel to taxonomic identification reports, maldipickr process spectra data. Make sure maldipickr is installed and loaded, alternatively follow the instructions to install the package.
Cherry-picking six isolates from three species based on their spectra data obtained from the MALDI Biotyper is done in a few steps with maldipickr.
Get example data
We set up the directory location of our example spectra data, but
adjust for your requirements. We import and process the spectra which
gives us a named list of three objects: spectra, peaks and metadata
(more details in Value section of process_spectra()
).
spectra_dir <- system.file("toy-species-spectra", package = "maldipickr")
processed <- spectra_dir %>%
import_biotyper_spectra() %>%
process_spectra()
Delineate clusters and cherry-pick
Delineate spectra clusters using Cosine similarity and cherry-pick
one representative spectra. The chosen ones are indicated by
to_pick
column.
processed %>%
list() %>%
merge_processed_spectra() %>%
coop::tcosine() %>%
delineate_with_similarity(threshold = 0.92) %>%
set_reference_spectra(processed$metadata) %>%
pick_spectra() %>%
dplyr::relocate(name, to_pick) %>%
knitr::kable()
name | to_pick | membership | cluster_size | SNR | peaks | is_reference |
---|---|---|---|---|---|---|
species1_G2 | FALSE | 1 | 4 | 5.089590 | 21 | FALSE |
species2_E11 | FALSE | 2 | 2 | 5.543735 | 22 | FALSE |
species2_E12 | TRUE | 2 | 2 | 5.633540 | 23 | TRUE |
species3_F7 | FALSE | 1 | 4 | 4.889949 | 26 | FALSE |
species3_F8 | TRUE | 1 | 4 | 5.558884 | 25 | TRUE |
species3_F9 | FALSE | 1 | 4 | 5.398429 | 25 | FALSE |
This provides only a brief overview of the features of maldipickr, browse the other vignettes to learn more about additional features.
Session information
sessionInfo()
#> R version 4.4.1 (2024-06-14)
#> Platform: x86_64-pc-linux-gnu
#> Running under: Ubuntu 22.04.4 LTS
#>
#> Matrix products: default
#> BLAS: /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3
#> LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/libopenblasp-r0.3.20.so; LAPACK version 3.10.0
#>
#> locale:
#> [1] LC_CTYPE=C.UTF-8 LC_NUMERIC=C LC_TIME=C.UTF-8
#> [4] LC_COLLATE=C.UTF-8 LC_MONETARY=C.UTF-8 LC_MESSAGES=C.UTF-8
#> [7] LC_PAPER=C.UTF-8 LC_NAME=C LC_ADDRESS=C
#> [10] LC_TELEPHONE=C LC_MEASUREMENT=C.UTF-8 LC_IDENTIFICATION=C
#>
#> time zone: UTC
#> tzcode source: system (glibc)
#>
#> attached base packages:
#> [1] stats graphics grDevices datasets utils methods base
#>
#> other attached packages:
#> [1] maldipickr_1.3.2
#>
#> loaded via a namespace (and not attached):
#> [1] jsonlite_1.8.8 dplyr_1.1.4 compiler_4.4.1
#> [4] renv_1.0.3 MALDIquant_1.22.3 tidyselect_1.2.1
#> [7] parallel_4.4.1 tidyr_1.3.1 jquerylib_0.1.4
#> [10] systemfonts_1.1.0 textshaping_0.4.0 yaml_2.3.10
#> [13] fastmap_1.2.0 R6_2.5.1 generics_0.1.3
#> [16] knitr_1.48 tibble_3.2.1 desc_1.4.3
#> [19] readBrukerFlexData_1.9.2 bslib_0.8.0 pillar_1.9.0
#> [22] rlang_1.1.4 utf8_1.2.4 cachem_1.1.0
#> [25] xfun_0.47 fs_1.6.4 sass_0.4.9
#> [28] cli_3.6.3 withr_3.0.1 pkgdown_2.1.0
#> [31] magrittr_2.0.3 digest_0.6.37 lifecycle_1.0.4
#> [34] vctrs_0.6.5 evaluate_0.24.0 glue_1.7.0
#> [37] ragg_1.3.3 coop_0.6-3 fansi_1.0.6
#> [40] rmarkdown_2.28 purrr_1.0.2 tools_4.4.1
#> [43] pkgconfig_2.0.3 htmltools_0.5.8.1