Dfam Family Files
=================

There are several ways to access Dfam families. We provide a website for
detailed exploration at http://dfam.org, an API for programmatic access to
the database at http://dfam.org/api ( docs: http://dfam.org/help/api ), and
several options for offline access:

  o FamDB [2]

    FamDB is an HDF5-based export format designed for efficient storage and
    retrieval of Dfam data for offline use. The V3 format provides the full
    Dfam export in components and partitions. Components are divided by
    model type and curation status:

      - Curated-consensus (CC): The consensus sequences (and metadata) for
        the curated [DF families] in Dfam.
      - Uncurated-consensus (UC): The consensus sequences (and metadata) for
        the uncurated [DR families] in Dfam.
      - Curated-HMM (CH): The profile HMMs (and metadata) for the curated
        [DF families] in Dfam.
      - Uncurated-HMM (UH): The profile HMMs (and metadata) for the
        uncurated [DR families] in Dfam.

    Each component is further divided into taxonomic partitions to reduce
    the size of downloads. This two-tier segmentation allows for smaller
    downloads for the most common use cases. For example, if you are only
    interested in the curated consensus sequences, you can download only
    the CC component, which currently fits in a single partition file.

    All components and partitions are optional and may be loaded in any
    combination. The only file required to use FamDB is the root partition,
    which contains the component and partition metadata for a given
    release. Please see the FamDB/README.txt for more information on how to
    use this format.

  o HMM files for use with nhmmer [1]:

      - Dfam-#.hmm.gz : A file containing profile HMMs for *all* Dfam
        families (DF [curated] and DR [uncurated]). See the userman.txt
        file for details on the HMM format.
      - Dfam-curated_only-#.hmm.gz : A file containing profile HMMs for
        curated Dfam families (DF records only).

  o EMBL files for use with consensus-based tools:

      - Dfam-#.embl.gz : A file containing EMBL records for *all* Dfam
        families (DF [curated] and DR [uncurated]).
      - Dfam-curated_only-#.embl.gz : A file containing EMBL records for
        curated Dfam families (DF records only).

An md5sum file ( `*.md5sum` ) is provided for each product for download
validation. For more information on the metadata in the EMBL and HMM files,
see Dfam's userman.txt [3].

[1]: http://hmmer.org/
[2]: https://github.com/Dfam-consortium/FamDB/
[3]: https://www.dfam.org/releases/current/userman.txt


Using Dfam with RepeatMasker
============================

RepeatMasker 4.2.4 and later versions are no longer packaged together with
the FamDB tool and data. This allows users to download FamDB separately and
update either one as needed. RepeatMasker still works without FamDB,
provided users supply custom library files to search with.


Deprecated Support Files
========================

Previous releases of RepeatModeler used these reference files for
classification. Newer versions obtain this data from FamDB directly. The
files provided here are the Dfam 3.9 versions, kept to avoid breaking older
versions of RepeatModeler.

WARNING: Dfam 3.9 versions
--------------------------
  Dfam-RepeatMasker.lib.gz
  RepeatPeps.lib.gz
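

Validating Downloads
====================

The `*.md5sum` files mentioned in the offline access list can be checked
with the standard md5sum utility or with a few lines of Python. The sketch
below is one possible approach, not an official Dfam tool. It assumes each
md5sum file uses the common "<hex digest>  <file name>" layout, which this
README does not spell out, and the function names are hypothetical.

```python
import hashlib
from pathlib import Path


def file_md5(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read until an empty bytes object signals end of file.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def read_expected_md5(md5sum_path):
    """Return the digest stored in an md5sum file.

    Assumption: the digest is the first whitespace-separated token,
    as in the usual "<hex digest>  <file name>" layout.
    """
    text = Path(md5sum_path).read_text().strip()
    if not text:
        raise ValueError(f"{md5sum_path} is empty")
    return text.split()[0].lower()


def verify_download(data_path, md5sum_path):
    """Return True if the file's digest matches the md5sum file."""
    return file_md5(data_path) == read_expected_md5(md5sum_path)
```

To use it, call verify_download() with the path to a downloaded product
and the path to its matching `*.md5sum` file. A return value of False
suggests the download is incomplete or corrupted and should be repeated.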