Dataset: Analysis protocols, sample information, code, and datasets associated with the manuscript Structure and long-term stability of the microbiome in diverse diatom cultures from samples collected in Guam, California, and Gulf of Mexico between 2008 and 2016

Name: Analysis protocols, sample information, code, and datasets associated with the manuscript Structure and long-term stability of the microbiome in diverse diatom cultures from samples collected in Guam, California, and Gulf of Mexico between 2008 and 2016
Published: 2021-07-15
Keywords: diatom, microbiome, oceans

Final, no updates expected
DOI: 10.26008/1912/bco-dmo.855750.1
Version 1 (2021-07-15)
Dataset Type: Other Field Results

Principal Investigator: James Jeffrey Morris (University of Alabama at Birmingham)

Co-Principal Investigator: Matt Ashworth (University of Texas at Austin)

BCO-DMO Data Manager: Amber D. York (Woods Hole Oceanographic Institution)

Project: Collaborative Research: Ecology and Evolution of Microbial Interactions in a Changing Ocean (LTPE)

Abstract

Analysis protocols, sample information, code, and datasets associated with the manuscript "Structure and long-term stability of the microbiome in diverse diatom cultures." Sequence data are available from the NCBI SRA archive, BioProject PRJNA706454.

Spatial Extent: N:33.71 E:-80.78 S:13.25 W:-215.30
Temporal Extent: 2008-08-18 - 2015-11-19
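
The western bound of -215.30 falls outside the conventional -180 to 180 degree range. It appears to describe the Guam sites by counting westward across the antimeridian. The sketch below assumes that interpretation and wraps the value into the conventional range; the function name `wrap_longitude` is hypothetical and is not part of the dataset.

```python
def wrap_longitude(lon_deg: float) -> float:
    """Wrap a longitude in degrees into the interval [-180, 180)."""
    return (lon_deg + 180.0) % 360.0 - 180.0


# Western bound as listed in the Spatial Extent line.
print(round(wrap_longitude(-215.30), 2))  # 144.7, i.e. about 144.70 degrees East
# A value already inside the range, such as the eastern bound, is unchanged.
print(round(wrap_longitude(-80.78), 2))   # -80.78
```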

File(s)	Type	Description
longterm_microbe_sample_info.csv (3.45 KB)	Comma Separated Values (.csv)	Primary data file for dataset ID 855750
Ashworth_data_and_analysis.zip (18.51 KB)	ZIP Archive (ZIP)	Data and data analysis package for Filho et al. (2021); its README is reproduced below

READ ME for Filho et al. (2021) data and data analysis package

Code and data files described below are packaged within Ashworth_data_and_analysis.zip. References to figures and tables below refer to the results paper "Structure and Long-Term Stability of the Microbiome in Diverse Diatom Cultures," Filho et al. (2021).

This data archival package contains 11 files:
1. This readme.txt file
2. Ashworth.mothurcode.txt, which contains the mothur code used to analyze 16S sequencing data
3. Ashworth.rcode.txt, which contains the R code used for some of the statistical analysis of the mothur output
4-11. Comma-separated values spreadsheets containing the raw data used in the R analyses

Ashworth.mothurcode.txt
To execute this code, first download the fastq sequence files from the NCBI SRA archive, BioProject PRJNA706454. Execute mothur from the directory containing these files, and then run each line in order. You will need to adjust the paths to the Silva 16S databases based on your own system; instructions on where to find these databases can be found on the mothur wiki page.
NOTE: there are a few lines of R code commented into the mothur code. These require a package called SRS that executes the ranked subsampling algorithm described in the manuscript text. The comments explain how to transfer your mothur data to R, execute the SRS code, and then transfer the output back into mothur.
Throughout the mothur code there are commented lines showing output relevant to our data analysis. These correspond to results reported in the manuscript and can be helpful guideposts if you are trying to replicate our results.

Ashworth.rcode.txt
To execute this code, set your R working directory to the location of the .csv files contained in this data archive. You should then be able to run all of the code at once, replicating our statistical analyses and re-creating our figures. Key results are included as commented lines.
NOTE: at the end of this file we have included, as commented lines, the results of our online blastn analyses with full details on the best hits for our unidentified bacteria.

Ashworth_Culture1.csv
This file shows the relative abundances of the 10 most common OTUs in Culture 1 at each of the 4 sampled time points. The top row indicates the elapsed time since cultivation for each sample. This file is one of the inputs used to create the Muller plot in Figure 2.

Ashworth.Muller.csv
This file is the other input necessary for creating Figure 2. All it shows is that none of the OTUs are lineal descendants of any of the others.

axes.csv
This is the main data file containing the mothur output regarding diversity, culture identity, and ordination results. Columns are as follows:
Culture: Unique diatom cultures as described in Table 1
Group: Code signifying which fastq files correspond to each sample
Species: Diatom species
Site: Specific sampling location where the culture was collected
Locale: More coarse-grained region where the culture was collected
Class: Diatom class
Order: Diatom order
Time: Time between culture isolation and DNA extraction
MostAbundOTU: OTU number that was most abundant in each sample
MostAbundTax: Taxonomy of the most abundant OTU
PropMostAbund: Relative abundance of the most abundant OTU
PropUbiquitous: Proportion of each sample comprised of the 32 ubiquitous OTUs
NumberUbiquitous: Number of the 32 ubiquitous OTUs detected in the sample
axis1, axis2, axis3: Coordinates for each sample from NMDS jabund analysis
chao, chao_lci, chao_hci: Chao index with low and high confidence intervals
coverage: Estimated sequencing coverage of the sample
sobs: Number of OTUs in the sample
shannon, shannon_lci, shannon_hci: Shannon diversity index with low and high confidence intervals
invsimpson, invsimpson_lci, invsimpson_hci: Inverse Simpson diversity index with low and high confidence intervals

CorrAxes.csv
These coordinates describe the vectors of the 10 most abundant OTUs with a significant impact on the position of samples in the NMDS plot. The "Cultures" column indicates how many different cultures each OTU was detected in.

SharedOTUs.csv, SharedOTUs.astrosyne.csv, SharedOTUs.gabgab.csv
These files show the number of OTUs shared between pairs of samples. Files show either all samples, only the samples from Astrosyne radiata cultures, or only the samples from cultures originally collected at Gab Gab Beach in Guam.

ubiquitous.csv
This file shows the relative abundances of the 32 ubiquitous OTUs in each of the 15 samples. It was used to create the hierarchical clustering plot in Figure 3.
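
The sobs, shannon, invsimpson, and coverage columns of axes.csv are mothur outputs. For readers unfamiliar with these summaries, the sketch below computes them from a vector of per-OTU read counts using the standard textbook definitions. It is not the authors' code. The finite-sample form of Simpson's index and the use of Good's coverage (one minus the fraction of reads that are singletons) are assumptions about what mothur reports and should be checked against the mothur wiki. The function name and the example counts are hypothetical.

```python
import math


def diversity_summary(otu_counts):
    """Return sobs, shannon, invsimpson and coverage for one sample.

    otu_counts: iterable of non-negative integer read counts, one per OTU.
    """
    counts = [c for c in otu_counts if c > 0]  # OTUs absent from the sample are ignored
    total = sum(counts)
    if total == 0:
        raise ValueError("sample has no reads")

    # Observed richness: number of OTUs with at least one read.
    sobs = len(counts)

    # Shannon index: H = -sum(p_i * ln(p_i)).
    shannon = -sum((c / total) * math.log(c / total) for c in counts)

    # Simpson's D in its finite-sample form (assumed): sum n_i(n_i - 1) / (N(N - 1)).
    if total > 1:
        simpson_d = sum(c * (c - 1) for c in counts) / (total * (total - 1))
    else:
        simpson_d = 0.0
    invsimpson = 1.0 / simpson_d if simpson_d > 0 else math.inf

    # Good's coverage (assumed): 1 - (number of singleton OTUs / total reads).
    singletons = sum(1 for c in counts if c == 1)
    coverage = 1.0 - singletons / total

    return {
        "sobs": sobs,
        "shannon": shannon,
        "invsimpson": invsimpson,
        "coverage": coverage,
    }


# Made-up read counts for five OTUs in a single sample.
print(diversity_summary([50, 30, 15, 4, 1]))
```

The confidence-interval columns (the _lci and _hci suffixes) and the Chao estimator are not reproduced here; they depend on mothur's own procedures.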

See "Data Files" section for access to download the data and analysis code "Ashworth_data_and_analysis.zip". The sample information and genetic accession identifiers are available as a data table from this page.
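
After downloading the archive, it can be useful to confirm that it holds the 11 files the README lists. The sketch below compares the archive contents against those names. It assumes the files may sit inside a top-level folder and that capitalization may differ, so it matches on base name and ignores case. The function name `missing_files` is hypothetical.

```python
import zipfile
from pathlib import PurePosixPath

# File names as given in the README for Ashworth_data_and_analysis.zip.
EXPECTED_FILES = {
    "readme.txt",
    "Ashworth.mothurcode.txt",
    "Ashworth.rcode.txt",
    "Ashworth_Culture1.csv",
    "Ashworth.Muller.csv",
    "axes.csv",
    "CorrAxes.csv",
    "SharedOTUs.csv",
    "SharedOTUs.astrosyne.csv",
    "SharedOTUs.gabgab.csv",
    "ubiquitous.csv",
}


def missing_files(archive):
    """Return the expected file names that are not found in the archive.

    archive: a path to the .zip file or a binary file-like object.
    """
    with zipfile.ZipFile(archive) as zf:
        # Keep only the base name of each member and skip directory entries.
        present = {
            PurePosixPath(name).name.lower()
            for name in zf.namelist()
            if not name.endswith("/")
        }
    return {name for name in EXPECTED_FILES if name.lower() not in present}
```

Calling `missing_files("Ashworth_data_and_analysis.zip")` from the download directory should return an empty set if the archive matches the README.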

Related Datasets

Morris, J. J., University of Alabama at Birmingham (2021). Structure and Long-Term Stability of the Microbiome in Diverse Diatom Cultures. 2021/03. In NCBI:BioProject: PRJNA706454. Bethesda, MD: National Library of Medicine (US), National Center for Biotechnology Information; Available from: http://www.ncbi.nlm.nih.gov/bioproject/PRJNA706454.

Related Publications

Results

Barreto Filho, M. M., Walker, M., Ashworth, M. P., & Morris, J. J. (2021). Structure and Long-Term Stability of the Microbiome in Diverse Diatom Cultures. Microbiology Spectrum. doi:10.1128/spectrum.00269-21

Methods

Caporaso, J. G., Lauber, C. L., Walters, W. A., Berg-Lyons, D., Lozupone, C. A., Turnbaugh, P. J., … Knight, R. (2010). Global patterns of 16S rRNA diversity at a depth of millions of sequences per sample. Proceedings of the National Academy of Sciences, 108(Supplement_1), 4516–4522. doi:10.1073/pnas.1000080107

Methods

Guillard, R. R. L. (1975). Culture of Phytoplankton for Feeding Marine Invertebrates. Culture of Marine Invertebrate Animals, 29–60. doi:10.1007/978-1-4615-8714-9_3

Methods

Kozich, J. J., Westcott, S. L., Baxter, N. T., Highlander, S. K., & Schloss, P. D. (2013). Development of a Dual-Index Sequencing Strategy and Curation Pipeline for Analyzing Amplicon Sequence Data on the MiSeq Illumina Sequencing Platform. Applied and Environmental Microbiology, 79(17), 5112–5120. doi:10.1128/aem.01043-13

Software

R Core Team (2020). R: A language and environment for statistical computing. R v4.0.3. R Foundation for Statistical Computing, Vienna, Austria. URL https://www.R-project.org/
