unclustered_recruitment.py

usage: unclustered_recruitment.py

Recruit unclustered contigs given metagenome annotations and Autometa binning
results. Note: All tables must contain a 'contig' column to be used as the
unique table index.

optional arguments:
  -h, --help            show this help message and exit
  --kmers KMERS         Path to normalized kmer frequencies table. (default:
                        None)
  --coverage COVERAGE   Path to coverage table. (default: None)
  --binning BINNING     Path to Autometa binning output [will look for
                        col='cluster'] (default: None)
  --markers MARKERS     Path to domain-specific markers table. (default: None)
  --output-binning OUTPUT_BINNING
                        Path to output unclustered recruitment table.
                        (default: None)
  --output-main OUTPUT_MAIN
                        Path to write Autometa main table used during/after
                        unclustered recruitment. (default: None)
  --output-features OUTPUT_FEATURES
                        Path to write Autometa features table used during
                        unclustered recruitment. (default: None)
  --taxonomy TAXONOMY   Path to taxonomy table. (default: None)
  --taxa-dimensions TAXA_DIMENSIONS
                        Number of dimensions to reduce taxonomy encodings to
                        (default: None)
  --additional-features [ADDITIONAL_FEATURES ...]
                        Path(s) to additional feature tables to add to the
                        classifier training data. (default: [])
  --confidence CONFIDENCE
                        Minimum confidence, as a fraction from 0 to 1,
                        required to accept a classification (confidence =
                        num. consistent predictions / num. classifications)
                        (default: 1.0)
  --num-classifications NUM_CLASSIFICATIONS
                        Num classifications for predicting/validating contig
                        cluster recruitment (default: 10)
  --classifier {decision_tree,random_forest}
                        Classifier to use for recruitment of contigs (default:
                        decision_tree)
  --kmer-dimensions KMER_DIMENSIONS
                        Number of dimensions to reduce normalized k-mer
                        frequencies to (default: 50)
  --seed SEED           Seed to use for RandomState when initializing
                        classifiers. (default: 42)
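
The confidence formula given under --confidence (num. consistent predictions /
num. classifications) can be illustrated with a minimal Python sketch. This is
not Autometa's implementation: the function name `recruit`, the input layout
(contig mapped to one predicted cluster label per classification round), and
the choice to treat the most frequent label as the "consistent" prediction are
all assumptions made for illustration.

```python
from collections import Counter


def recruit(predictions, confidence=1.0):
    """Keep contigs whose predictions agree often enough across rounds.

    predictions: dict mapping contig -> list of predicted cluster labels,
                 one label per classification round (hypothetical layout).
    confidence:  minimum fraction of rounds that must agree (0 to 1).
    """
    recruited = {}
    for contig, labels in predictions.items():
        if not labels:
            continue  # no classifications were made for this contig
        # Assumption: the most frequent label counts as the consistent prediction.
        cluster, count = Counter(labels).most_common(1)[0]
        if count / len(labels) >= confidence:
            recruited[contig] = cluster
    return recruited


# Ten classification rounds per contig, mirroring --num-classifications 10.
predictions = {
    "contig_1": ["bin_A"] * 10,                # 10/10 rounds agree
    "contig_2": ["bin_A"] * 9 + ["bin_B"],     # 9/10 rounds agree
    "contig_3": ["bin_B"] * 5 + ["bin_C"] * 5, # 5/10 rounds agree
}

strict = recruit(predictions, confidence=1.0)
relaxed = recruit(predictions, confidence=0.9)
```

Under these assumptions, the default of 1.0 recruits only contig_1, since every
round must agree, while 0.9 also recruits contig_2. Lowering --confidence
therefore trades stricter agreement for more recruited contigs.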