Related data table and dataset descriptions:
The primary data table for this dataset is provided under the "Data Files" section and contains total protein spectral counts, while the table under "Supplemental Files" provides the exclusive protein spectral counts.
Total spectral counts refer to the total number of spectra with peptide-to-spectrum matches (PSMs) that match each entry within the FASTA sequence database. This approach allows each peptide to map to multiple closely related sequences. In contrast, with exclusive spectral counts each peptide is only allowed to map to one sequence within the FASTA database; when a peptide is found in multiple database sequences, the sequence with the most peptides mapping to it is selected (parsimony). There are pros and cons to each approach: total spectral counts will double count peptides when two similar proteins are compared, whereas exclusive spectral counts will underrepresent less abundant proteins with shared peptides, favoring the homolog with the most shared peptides. Considering protein groups with shared peptides or focusing on peptide-level analyses are alternative approaches that could be constructed from these results.
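The difference between the two counting schemes can be illustrated with a minimal Python sketch. The peptides, protein names, and tie-breaking rule below are hypothetical and chosen only for illustration; the software used to generate this dataset may group proteins and resolve ties differently.

```python
from collections import Counter

# Hypothetical toy data: one peptide sequence per matched spectrum (PSM).
psms = ["AAK", "AAK", "AAK", "CCK", "DDK", "EEK"]

# Hypothetical mapping from each peptide to the FASTA entries containing it.
peptide_to_proteins = {
    "AAK": {"prot1", "prot2"},  # shared between two similar proteins
    "CCK": {"prot1"},
    "DDK": {"prot1"},
    "EEK": {"prot2"},
}


def total_counts(psms, peptide_to_proteins):
    """Count every PSM toward every protein that contains the peptide."""
    counts = Counter()
    for peptide in psms:
        for protein in peptide_to_proteins[peptide]:
            counts[protein] += 1
    return counts


def exclusive_counts(psms, peptide_to_proteins):
    """Count every PSM toward a single protein chosen by simple parsimony."""
    # Number of distinct observed peptides mapping to each protein.
    peptides_per_protein = Counter()
    for peptide in set(psms):
        for protein in peptide_to_proteins[peptide]:
            peptides_per_protein[protein] += 1

    counts = Counter()
    for peptide in psms:
        candidates = peptide_to_proteins[peptide]
        # Pick the protein with the most mapped peptides.
        # Assumption: ties are broken alphabetically by protein name.
        winner = min(candidates, key=lambda p: (-peptides_per_protein[p], p))
        counts[winner] += 1
    return counts


total = total_counts(psms, peptide_to_proteins)
exclusive = exclusive_counts(psms, peptide_to_proteins)
print(dict(total))      # prot1: 5, prot2: 4 (shared peptide counted twice)
print(dict(exclusive))  # prot1: 5, prot2: 1 (shared peptide goes to prot1)
```

In this toy case the three spectra for the shared peptide are counted for both proteins under total counting, while under exclusive counting they are all assigned to prot1, leaving prot2 with a single count.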
See "Related Datasets" section for:
* "AE1913 Peptide Spectral Counts" which includes the individual peptides associated with these proteins (includes total spectral counts for each peptide).
* "AE1913 Protein Identification FASTA"
CTD and other data from the same cruise are listed on deployment page AE1913: https://www.bco-dmo.org/deployment/916412
These data will become part of the Ocean Protein Portal (https://proteinportal.whoi.edu/; Saito et al., 2020).
The assembly, annotations, metatranscriptomic assembly products, the same exclusive protein spectral counts, and other useful information associated with this multi-omic analysis were published as a package at Zenodo (doi: 10.5281/zenodo.8287779).