MinKNOW is the software interface to the MinION handheld device. It handles configuration, starting a sequencing run, and transferring the sequencing signals from the device to the computer.
By default, MinKNOW generates basecalled FASTQ files from the FAST5 files (raw signal files) in real time.
Alternatively, after sequencing has finished, one can separately run the MinKNOW integrated basecaller or another basecaller (such as Albacore) to generate FASTQ files from the FAST5 files.
Real-time basecalling in MinKNOW uses the integrated Guppy basecaller and is enabled by default. It might be better to skip real-time basecalling, especially if you are unsure how much sequencing signal your DNA sample will produce, or if the hardware specifications (SSD storage capacity, memory, CPU) of your host computer are limited.
Albacore is/was one of the recommended basecallers. An example call, tuned for our first sequencing data, can be run using:
$ read_fast5_basecaller.py --flowcell FLO-MIN106 --kit SQK-RAD004 --output_format fast5,fastq --input fast5/ --save_path fast5-albacore/ --worker_threads 4 -r
For details, please consult the documentation and guide on the Nanopore website.
Kraken2 is open-source software that is recommended for metagenomic analysis of Nanopore data. Databases for several domains are integrated and available on the Street Science Galaxy.
Kraken2 generates a tabular file as well as a report.
The report is a text file with a tree-like structure that can be downloaded and viewed in an editor. For example:
39.40  1335  1335  U         0  unclassified
60.60  2053     0  R         1  root
60.60  2053     0  R1   131567    cellular organisms
60.60  2053     0  D      2759      Eukaryota
60.60  2053     0  D1    33154        Opisthokonta
60.60  2053     0  K      4751          Fungi
60.60  2053     2  K1   451864            Dikarya
60.48  2049     0  P      4890              Ascomycota
60.48  2049     1  P1   716545                saccharomyceta
60.18  2039     0  P2   147537                  Saccharomycotina
60.18  2039     0  C      4891                    Saccharomycetes
60.18  2039     5  O      4892                      Saccharomycetales
59.80  2026    19  F      4893                        Saccharomycetaceae
56.55  1916    34  G      4930                          Saccharomyces
53.34  1807     0  S      4932                            Saccharomyces cerevisiae
53.34  1807  1807  S1   559292                              Saccharomyces cerevisiae S288C
 2.21    75    75  S   1080349                            Saccharomyces eubayanus
Where the column fields are:
1. Percentage of fragments covered by the clade rooted at this taxon
2. Number of fragments covered by the clade rooted at this taxon
3. Number of fragments assigned directly to this taxon
4. A rank code, indicating (U)nclassified, (R)oot, (D)omain, (K)ingdom, (P)hylum, (C)lass, (O)rder, (F)amily, (G)enus, or (S)pecies. Taxa that are not at any of these 10 ranks have a rank code that is formed by using the rank code of the closest ancestor rank with a number indicating the distance from that rank. E.g., "G2" is a rank code indicating a taxon is between genus and species and the grandparent taxon is at the genus rank.
5. NCBI taxonomic ID number
6. Indented scientific name
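The six columns described above can also be read programmatically. The following is a minimal sketch, not part of the original workflow, assuming the downloaded report is the standard tab-separated Kraken2 report with exactly six columns and two spaces of name indentation per tree level. The names `ReportRow`, `parse_report` and `top_taxa` are hypothetical helpers introduced here for illustration.

```python
from typing import List, NamedTuple


class ReportRow(NamedTuple):
    percent: float     # column 1: % of fragments in the clade
    clade_reads: int   # column 2: fragments covered by the clade
    direct_reads: int  # column 3: fragments assigned directly to the taxon
    rank: str          # column 4: rank code such as "G" or "S1"
    taxid: int         # column 5: NCBI taxonomic ID
    name: str          # column 6: scientific name, indentation removed
    depth: int         # tree depth derived from the indentation


def parse_report(text: str) -> List[ReportRow]:
    """Parse the text of a six-column, tab-separated Kraken2 report."""
    rows = []
    for line in text.splitlines():
        if not line.strip():
            continue  # ignore blank lines
        fields = line.split("\t")
        if len(fields) != 6:
            raise ValueError(f"expected 6 tab-separated columns: {line!r}")
        pct, clade, direct, rank, taxid, name = fields
        stripped = name.lstrip(" ")
        # Assumption: two spaces of indentation per level of the tree.
        depth = (len(name) - len(stripped)) // 2
        rows.append(ReportRow(float(pct), int(clade), int(direct),
                              rank.strip(), int(taxid),
                              stripped.strip(), depth))
    return rows


def top_taxa(rows: List[ReportRow], rank: str = "S", n: int = 5) -> List[ReportRow]:
    """Return the n taxa of the given rank code with the most clade reads."""
    hits = [row for row in rows if row.rank == rank]
    return sorted(hits, key=lambda row: row.clade_reads, reverse=True)[:n]
```

Applied to the example report, `top_taxa(rows, rank="S")` would list Saccharomyces cerevisiae (1807 fragments) ahead of Saccharomyces eubayanus (75 fragments).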
We would now like to visualize this information using Krona.
(*) Some reformatting of the data might be needed for a proper visualization; please refer to the history.
A Galaxy history for the kickoff sequencing data is available on streetscience.usegalaxy.eu.
This will generate an interactive HTML chart.
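As an illustration of the kind of reformatting that may be needed, the sketch below converts a Kraken2 report into the tab-separated text layout that Krona's text import accepts: a count followed by the lineage, one level per column. This is an assumption about one possible approach, and the steps in the Galaxy history may differ. The function name `report_to_krona_text` is hypothetical, and the input is assumed to be the standard six-column report with two spaces of indentation per level.

```python
from typing import List


def report_to_krona_text(report_text: str) -> List[str]:
    """Convert a six-column Kraken2 report to 'count<TAB>lineage...' lines."""
    lineage: List[str] = []  # names on the path from the root to the current row
    out: List[str] = []
    for line in report_text.splitlines():
        fields = line.split("\t")
        if len(fields) != 6:
            continue  # skip blank or unexpected lines
        direct = int(fields[2])  # fragments assigned directly to this taxon
        rank = fields[3].strip()
        name = fields[5]
        stripped = name.strip()
        if rank == "U":
            # Unclassified reads sit outside the taxonomy tree.
            if direct > 0:
                out.append(f"{direct}\t{stripped}")
            continue
        # Assumption: two spaces of indentation per level of the tree.
        depth = (len(name) - len(name.lstrip(" "))) // 2
        del lineage[depth:]       # drop deeper taxa left over from earlier rows
        lineage.append(stripped)  # the path now ends at the current taxon
        if direct > 0:
            out.append("\t".join([str(direct)] + lineage))
    return out
```

Only the directly assigned counts (column 3) are written, since Krona sums the children of each node itself. Using the clade counts (column 2) instead would count the same fragments several times.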
Upload the fastq file
The FASTQ file is the final output of sequencing, generated by the MinKNOW internal basecaller or by an alternative basecaller such as Albacore.
Click on the Results button in the samples tab on the top left panel.
It might take a few minutes until the results become available. You should see something like this:
Codex presents the classification as an intuitive, interactive HTML output. The most visually interesting result is the Taxonomy Chart of Classified Reads.
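Before uploading, it can be useful to check that the FASTQ file is intact and to see how many reads it holds. The sketch below is an optional check and not part of the original workflow. It assumes plain four-line FASTQ records (no wrapped sequences), optionally gzip-compressed, and the helper names `read_fastq` and `summarize` are hypothetical.

```python
import gzip
from typing import Dict, Iterator, Tuple


def read_fastq(path: str) -> Iterator[Tuple[str, str, str]]:
    """Yield (read_id_line, sequence, quality) from a four-line FASTQ file."""
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "rt") as handle:
        while True:
            header = handle.readline().rstrip("\r\n")
            if not header:
                break  # end of file
            seq = handle.readline().rstrip("\r\n")
            handle.readline()  # '+' separator line, not needed here
            qual = handle.readline().rstrip("\r\n")
            if not header.startswith("@") or len(seq) != len(qual):
                raise ValueError(f"malformed FASTQ record near {header!r}")
            yield header[1:], seq, qual


def summarize(path: str) -> Dict[str, float]:
    """Count reads and bases, and report the mean and longest read length."""
    lengths = [len(seq) for _, seq, _ in read_fastq(path)]
    total = sum(lengths)
    return {
        "reads": len(lengths),
        "bases": total,
        "mean_length": total / len(lengths) if lengths else 0.0,
        "longest": max(lengths, default=0),
    }
```

A call such as `summarize("reads.fastq")`, where the file name is a placeholder, returns a small dictionary. A `ValueError` signals a truncated or corrupted record.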