Parameter Settings



  HUMAnN2: The HMP Unified Metabolic Analysis Network 2

HUMAnN is a pipeline for efficiently and accurately profiling the presence/absence and abundance of microbial pathways in a community from metagenomic or metatranscriptomic sequencing data (typically millions of short DNA/RNA reads). This process, referred to as functional profiling, aims to describe the metabolic potential of a microbial community and its members. More generally, functional profiling answers the question, "What are the microbes in my community-of-interest doing (or capable of doing)?"

  HUMAnN2 outputs

1. The pre-processing output files are in "kneaddata_out", which also contains a log file recording the commands that were run. This directory holds many FASTQ files, but only those ending in "_paired_1.fastq" and "_paired_2.fastq" are of interest here: they are the forward and reverse reads of pairs in which both reads passed the filtering criteria and were not mapped as contaminants. Reads that were called as contaminants are in the "contam" FASTQs. Reads that passed while their mate failed one of the pre-processing steps are in the "unmatched" FASTQs. These sequences can still be useful, but in this case they are excluded from downstream steps.

2. Paired-end reads after pre-processing are in "cat_reads".

3. Results produced by the HUMAnN2 algorithm are in "humann2_out", which contains the gene families, pathway abundance, pathway coverage, and MetaPhlAn2 results for each sample.

4. The visualized results consist of an abundance heat map and a gene family heat map.

Fig. 1. Abundance heat map

Fig. 2. Gene family heat map
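
The file selection described in output item 1 can be sketched in a few lines of Python. This is a minimal illustration, not part of the pipeline itself: the function name find_paired_reads and the sample file names in the demo are hypothetical, and the only naming rule assumed is the one stated above (forward reads end in "_paired_1.fastq", reverse reads in "_paired_2.fastq").

```python
import tempfile
from pathlib import Path

FORWARD_SUFFIX = "_paired_1.fastq"
REVERSE_SUFFIX = "_paired_2.fastq"


def find_paired_reads(kneaddata_dir):
    """Return {sample_prefix: (forward_path, reverse_path)} for complete pairs.

    Files that do not end in the two paired suffixes (for example the
    "contam" and "unmatched" FASTQs) are ignored.
    """
    directory = Path(kneaddata_dir)
    pairs = {}
    for forward in sorted(directory.glob("*" + FORWARD_SUFFIX)):
        prefix = forward.name[: -len(FORWARD_SUFFIX)]
        reverse = directory / (prefix + REVERSE_SUFFIX)
        # Keep a sample only when both mates are present.
        if reverse.exists():
            pairs[prefix] = (forward, reverse)
    return pairs


if __name__ == "__main__":
    # Hypothetical file names, created only to demonstrate the selection.
    with tempfile.TemporaryDirectory() as tmp:
        for name in (
            "sampleA_paired_1.fastq",
            "sampleA_paired_2.fastq",
            "sampleA_unmatched_1.fastq",
            "sampleA_paired_contam_1.fastq",
        ):
            (Path(tmp) / name).touch()
        for prefix, (fwd, rev) in find_paired_reads(tmp).items():
            print(prefix, fwd.name, rev.name)
```

Matching on the full suffix rather than on the substring "paired" keeps the contaminant and unmatched files out of the result, which mirrors the exclusion described above.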

HUMAnN2 Workflow

HUMAnN2 Features

  1. Community functional profiles stratified by known and unclassified organisms
  2. Considerably expanded databases of genomes, genes, and pathways
    • UniRef database provides gene family definitions
    • MetaCyc provides pathway definitions by gene family
    • MinPath is run to identify the set of minimum pathways
  3. A simple user interface (single command driven flow)
    • The user only needs to provide a quality-controlled metagenome or metatranscriptome
  4. Accelerated mapping of reads to reference databases (including run-time generated databases tailored to the input)
    • Bowtie2 is run for accelerated nucleotide-level searches
    • DIAMOND is run for accelerated translated searches
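
To make the first feature concrete, the sketch below separates community-level totals from per-organism (stratified) rows in a gene family or pathway table. It assumes the tab-separated layout in which a stratified row appends the organism to the feature name after a "|" character, with "unclassified" used when no organism is assigned. The function name split_stratified, the feature name, and the abundance values are hypothetical and for illustration only.

```python
import csv
import io

# Hypothetical example table: one community total and its two strata.
EXAMPLE = (
    "# Gene Family\tsample_Abundance-RPKs\n"
    "UniRef50_A\t10.0\n"
    "UniRef50_A|g__Bacteroides.s__Bacteroides_vulgatus\t7.0\n"
    "UniRef50_A|unclassified\t3.0\n"
)


def split_stratified(tsv_text):
    """Split a table into community totals and per-organism contributions.

    Returns (community, stratified) where community maps feature -> value
    and stratified maps feature -> {organism: value}.
    """
    community = {}
    stratified = {}
    for row in csv.reader(io.StringIO(tsv_text), delimiter="\t"):
        # Skip blank lines and the header line, which starts with "#".
        if not row or row[0].startswith("#"):
            continue
        feature, value = row[0], float(row[1])
        if "|" in feature:
            name, organism = feature.split("|", 1)
            stratified.setdefault(name, {})[organism] = value
        else:
            community[feature] = value
    return community, stratified


if __name__ == "__main__":
    totals, by_organism = split_stratified(EXAMPLE)
    print(totals)
    print(by_organism)
```

Keeping the two views in separate dictionaries makes it easy to check, per feature, how much of the community total is attributed to known organisms versus the unclassified stratum.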

If you use this pipeline in your work, please cite:

Franzosa EA*, McIver LJ*, Rahnavard G, Thompson LR, Schirmer M, Weingart G, Schwarzberg Lipson K, Knight R, Caporaso JG, Segata N, Huttenhower C. Species-level functional profiling of metagenomes and metatranscriptomes. Nat Methods 15: 962-968 (2018).