Input files

Paired-end FASTQ raw data and sample information file (.zip)
Input files (.zip)
├── 16s_data
│   ├── A1T1_1.fastq
│   ├── A1T1_2.fastq
│   ├── B1T1_1.fastq
│   └── B1T1_2.fastq
└── map.txt

The sample map.txt is as follows:

#SampleID  BarcodeSequence  LinkerPrimerSequence         SampleType  Description
A1T1       TACGCTGC         TATCCTCTTCGTATCCTCT...WGCAG  tumor       A1T1
B1T1       ATGCGCAG         TATCCTCTTCGTATCCTCT...WGCAG  normal      B1T1

Txt format OTU table from QIIME and sample information file (.zip)
Input folder(.zip)
├── otu_table.txt
└── map.txt

The sample map.txt is as follows:

#SampleID  BarcodeSequence  LinkerPrimerSequence         SampleType  Description
A1T1       TACGCTGC         TATCCTCTTCGTATCCTCT...WGCAG  tumor       A1T1
B1T1       ATGCGCAG         TATCCTCTTCGTATCCTCT...WGCAG  normal      B1T1

The input named "otu_table.txt" must be an OTU table in txt format generated by QIIME. An example is shown below:

# Constructed from biom file
#OTU ID  Sample1  Sample2  Sample3  taxonomy
182771   1.0      0.0      0.0      k__Bacteria; p__Firmicutes; c__Clostridia; o__Clostridiales; f__Clostridiaceae; g__SMB53; s__
4333897  11.0     0.0      2.0      k__Bacteria; p__Bacteroidetes; c__Bacteroidia; o__Bacteroidales; f__Bacteroidaceae; g__Bacteroides; s__caccae
421538   5.0      0.0      2.0      k__Bacteria; p__Actinobacteria; c__Actinobacteria; o__Actinomycetales; f__Micrococcaceae; g__Rothia; s__mucilaginosa
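
A minimal Python sketch of reading such a table, in case it helps to check a file before uploading. It assumes the file is tab-separated (as QIIME text OTU tables normally are) and that the last column is the taxonomy string; the function name `read_otu_table` is made up for this illustration and is not part of MAAWf.

```python
import csv
import io


def read_otu_table(handle):
    """Parse a tab-separated OTU table into (samples, counts, taxonomy).

    Assumption: the last column holds the taxonomy string.
    """
    samples, counts, taxonomy = [], {}, {}
    for row in csv.reader(handle, delimiter="\t"):
        if not row:
            continue
        if row[0].startswith("#OTU ID"):
            # Header row: sample names sit between the ID and taxonomy columns.
            samples = row[1:-1]
        elif row[0].startswith("#"):
            continue  # comment line such as "# Constructed from biom file"
        else:
            otu_id = row[0]
            counts[otu_id] = dict(zip(samples, map(float, row[1:-1])))
            taxonomy[otu_id] = [level.strip() for level in row[-1].split(";")]
    return samples, counts, taxonomy


# One-row example built from the first OTU of the sample above.
example = (
    "# Constructed from biom file\n"
    "#OTU ID\tSample1\tSample2\tSample3\ttaxonomy\n"
    "182771\t1.0\t0.0\t0.0\tk__Bacteria; p__Firmicutes; c__Clostridia; "
    "o__Clostridiales; f__Clostridiaceae; g__SMB53; s__\n"
)
samples, counts, taxonomy = read_otu_table(io.StringIO(example))
print(samples)
print(counts["182771"])
print(taxonomy["182771"][-2])
```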

Biom format OTU table from QIIME and sample information file (.zip)
Input folder(.zip)
├── otu_table.biom
└── map.txt

The sample map.txt is as follows:

#SampleID  BarcodeSequence  LinkerPrimerSequence         SampleType  Description
A1T1       TACGCTGC         TATCCTCTTCGTATCCTCT...WGCAG  tumor       A1T1
B1T1       ATGCGCAG         TATCCTCTTCGTATCCTCT...WGCAG  normal      B1T1

The input named "otu_table.biom" must be an OTU table in biom format generated by QIIME. An example is shown below:

{"id": "None",
 "format": "Biological Observation Matrix 2.1.0",
 "format_url": "http://biom-format.org",
 "generated_by": "QIIME 1.9.0",
 "date": "2020-01-05T05:29:50.472161",
 "matrix_element_type": "float",
 "shape": [1251, 8],
 "type": null,
 "matrix_type": "sparse",
 "data": [[0,0,1.0],[1,1,1.0],[2,1,1.0] ... ,
 "rows": [{"id": "182771", "metadata": {"taxonomy": ["k__Bacteria", "p__Verrucomicrobia", "c__Verrucomicrobiae", "o__Verrucomicrobiales", "f__Verrucomicrobiaceae", "g__Akkermansia", "s__muciniphila"]}}, ... ,
 "columns": [{"id": "Sample1", "metadata": null},{"id": "Sample2", "metadata": null},{"id": "Sample3", "metadata": null}]}
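
As an illustration of how the sparse "data" field in the sample above is laid out, the following Python sketch expands it into per-OTU counts. It assumes the file is JSON text as in the sample and that each "data" entry is a [row index, column index, value] triplet. The toy table and the ID "OTU_B" are hypothetical, since the sample above is elided, and `sparse_biom_to_counts` is a name invented here.

```python
import json


def sparse_biom_to_counts(biom_text):
    """Expand a sparse JSON biom table into {otu_id: {sample_id: value}}."""
    table = json.loads(biom_text)
    if table.get("matrix_type") != "sparse":
        raise ValueError("this sketch only handles sparse tables")
    row_ids = [row["id"] for row in table["rows"]]
    col_ids = [col["id"] for col in table["columns"]]
    # Cells that are not listed in "data" are zero.
    counts = {otu: {sample: 0.0 for sample in col_ids} for otu in row_ids}
    for row_index, col_index, value in table["data"]:
        counts[row_ids[row_index]][col_ids[col_index]] = float(value)
    return counts


# Hypothetical two-OTU, three-sample table (not real data).
toy = json.dumps({
    "matrix_type": "sparse",
    "shape": [2, 3],
    "data": [[0, 0, 1.0], [1, 1, 1.0]],
    "rows": [{"id": "182771", "metadata": None},
             {"id": "OTU_B", "metadata": None}],
    "columns": [{"id": "Sample1", "metadata": None},
                {"id": "Sample2", "metadata": None},
                {"id": "Sample3", "metadata": None}],
})
counts = sparse_biom_to_counts(toy)
print(counts["182771"])
print(counts["OTU_B"])
```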

Example data sets for testing

Example 1
Gut microbiome data Pc (PRJNA302832) from 10 patients receiving ipilimumab treatment.
  16S rDNA data example
Input files

MAAWf supports 16S rDNA data in three formats: sequence, OTU (.txt), and OTU (.biom).

1. For sequence data, the input must be a folder named "16s_data" containing the 16S rDNA reads as FASTQ files, together with a mapping file named "map.txt". The folder and the mapping file must be compressed into a single zip archive. Paired-end FASTQ files must be named samplename_1.fastq and samplename_2.fastq. The file structure is shown below:

Input files (.zip)
├── 16s_data
│   ├── A1T1_1.fastq
│   ├── A1T1_2.fastq
│   ├── B1T1_1.fastq
│   └── B1T1_2.fastq
└── map.txt
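
The layout above could be assembled with a short Python script along these lines. This is a sketch under assumptions: it places "16s_data" and "map.txt" at the root of the archive (the description does not say whether a wrapping folder is expected), and the function names `find_unpaired` and `build_input_zip` are made up for this example.

```python
import tempfile
import zipfile
from pathlib import Path


def find_unpaired(fastq_names):
    """Return sample names that lack either the _1 or the _2 FASTQ file."""
    mates = {}
    for name in fastq_names:
        base = Path(name).name
        for suffix in ("_1.fastq", "_2.fastq"):
            if base.endswith(suffix):
                mates.setdefault(base[: -len(suffix)], set()).add(suffix[1])
    return sorted(s for s, found in mates.items() if found != {"1", "2"})


def build_input_zip(data_dir, map_file, zip_path):
    """Zip the FASTQ files under 16s_data/ next to map.txt (assumed root layout)."""
    fastqs = sorted(Path(data_dir).glob("*.fastq"))
    unpaired = find_unpaired(p.name for p in fastqs)
    if unpaired:
        raise ValueError("samples missing a mate file: " + ", ".join(unpaired))
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for fastq in fastqs:
            archive.write(fastq, arcname=f"16s_data/{fastq.name}")
        archive.write(map_file, arcname="map.txt")


# Demo with empty placeholder files in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    tmp = Path(tmp)
    data_dir = tmp / "16s_data"
    data_dir.mkdir()
    for name in ("A1T1_1.fastq", "A1T1_2.fastq", "B1T1_1.fastq", "B1T1_2.fastq"):
        (data_dir / name).write_text("")
    (tmp / "map.txt").write_text("#SampleID\n")
    build_input_zip(data_dir, tmp / "map.txt", tmp / "input.zip")
    with zipfile.ZipFile(tmp / "input.zip") as archive:
        names = sorted(archive.namelist())
print(names)
```

Checking for missing mates before zipping catches the most likely naming mistake (a sample with only one of its two read files) before the upload.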

2. For OTU (.txt), the input must be an OTU table in txt format generated by QIIME, named "otu_table.txt", together with a mapping file named "map.txt". Both files must be compressed into a single zip archive. The file structure is shown below:

Input folder(.zip)
├── otu_table.txt
└── map.txt

3. For OTU (.biom), the input must be an OTU table in biom format generated by QIIME, named "otu_table.biom", together with a mapping file named "map.txt". Both files must be compressed into a single zip archive. The file structure is shown below:

Input folder(.zip)
├── otu_table.biom
└── map.txt
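
To check which of the three layouts a prepared archive matches, something like the following Python sketch may be useful. It assumes the entries sit at the root of the zip, and `detect_input_format` is a hypothetical helper, not a MAAWf function.

```python
import io
import zipfile


def detect_input_format(zip_source):
    """Report which of the three documented layouts a zip archive matches."""
    with zipfile.ZipFile(zip_source) as archive:
        names = set(archive.namelist())
    if "map.txt" not in names:
        raise ValueError("map.txt is missing from the archive root")
    if any(n.startswith("16s_data/") and n.endswith(".fastq") for n in names):
        return "sequence"
    if "otu_table.txt" in names:
        return "OTU (.txt)"
    if "otu_table.biom" in names:
        return "OTU (.biom)"
    raise ValueError("no 16s_data folder, otu_table.txt or otu_table.biom found")


# Build a tiny in-memory archive with placeholder contents and classify it.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("otu_table.biom", "{}")
    archive.writestr("map.txt", "#SampleID\n")
detected = detect_input_format(buffer)
print(detected)
```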

map.txt consists of five columns: #SampleID, BarcodeSequence, LinkerPrimerSequence, SampleType and Description. #SampleID is the sample name; it must be unique and must match the sample's file name. BarcodeSequence is a nucleotide sequence that must be unique for each row. LinkerPrimerSequence is the common sequencing primer that accompanies the barcode. SampleType is the sample group. Description holds any information relevant to the sample, but its value must be unique for each sample. An example is shown below:

#SampleID  BarcodeSequence  LinkerPrimerSequence         SampleType  Description
A1T1       TACGCTGC         TATCCTCTTCGTATCCTCT...WGCAG  tumor       A1T1
B1T1       ATGCGCAG         TATCCTCTTCGTATCCTCT...WGCAG  normal      B1T1
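
The column rules above can be checked before upload with a small Python sketch like this one. It assumes map.txt is tab-separated, which is the usual QIIME mapping-file convention; the sample IDs, barcodes and primer in the demo are short placeholders rather than real sequences, and `check_mapping` is a name invented for this illustration.

```python
import csv
import io

COLUMNS = ["#SampleID", "BarcodeSequence", "LinkerPrimerSequence",
           "SampleType", "Description"]
MUST_BE_UNIQUE = ("#SampleID", "BarcodeSequence", "Description")


def check_mapping(handle):
    """Return a list of problems found in a tab-separated map.txt (empty if none)."""
    rows = [row for row in csv.reader(handle, delimiter="\t") if row]
    if not rows or rows[0] != COLUMNS:
        return ["header must be: " + ", ".join(COLUMNS)]
    problems, records = [], []
    # Row numbers count non-empty lines, with the header as row 1.
    for number, row in enumerate(rows[1:], start=2):
        if len(row) != len(COLUMNS):
            problems.append(
                f"row {number}: expected {len(COLUMNS)} fields, found {len(row)}")
        else:
            records.append(row)
    for column in MUST_BE_UNIQUE:
        index = COLUMNS.index(column)
        values = [record[index] for record in records]
        repeated = sorted({v for v in values if values.count(v) > 1})
        if repeated:
            problems.append(f"{column} repeats: {', '.join(repeated)}")
    return problems


# Placeholder rows: "good" passes, "bad" reuses a Description value.
good = ("\t".join(COLUMNS) + "\n"
        "S1\tACGT\tGGCC\ttumor\tS1\n"
        "S2\tTGCA\tGGCC\tnormal\tS2\n")
bad = good.replace("\tS2\n", "\tS1\n")
print(check_mapping(io.StringIO(good)))
print(check_mapping(io.StringIO(bad)))
```

The sketch does not verify that #SampleID matches the FASTQ file names or that barcodes contain only nucleotide letters; those checks would need to be added separately.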

Task ID

The task ID is the unique identifier used to query the run parameters and results, so be sure to record it.