Class NIODataStoreFactory

  • All Implemented Interfaces:
    DataStoreFactory

    public class NIODataStoreFactory
    extends java.lang.Object
    implements DataStoreFactory

    Builder for a data store that has no practical file size limit.

    This implementation of the data store factory uses longs as indices internally, so it can be used with files exceeding 2 gigabytes in size.
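
    To illustrate why long indices matter, here is a minimal sketch of offset arithmetic done with longs. It assumes the hashTable layout documented for this class is stored with no padding (a long length followed by 8-byte slots). The names LongOffsets, slotOffset and fitsInInt are hypothetical and are not part of the library.

```java
/** Hypothetical helper, not part of the library: long-based offset arithmetic. */
final class LongOffsets {
    private LongOffsets() {}

    /**
     * Byte offset of hash table slot {@code index}, assuming a long length
     * prefix followed by tightly packed 8-byte slots (an assumption).
     */
    static long slotOffset(long hashTablePos, long index) {
        return hashTablePos + Long.BYTES + index * Long.BYTES;
    }

    /** True if the offset could still be used as a non-negative int index. */
    static boolean fitsInInt(long offset) {
        return offset >= 0 && offset <= Integer.MAX_VALUE;
    }
}
```

    Under these assumptions, a table with 300,000,000 slots already places its last slots beyond byte 2,147,483,647 (Integer.MAX_VALUE), so an int offset would overflow where a long does not.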

    The data store file has the following structure.

     file: header, hashTable, nameTable, hitTable
    
     header:
       long hashTablePos, // byte offset in file
       long hitTablePos,  // byte offset in file
       long nameTablePos, // byte offset in file
       int wordLength,
       int serializedPackingLength,
       byte[] serializedPacking
    
     hashTable:
       long hashTableLength,
       long[hashTableLength] hits // byte offset into hitTable
    
     nameTable:
       int nameTableSize, // size in bytes
       (short nameLength, char[nameLength] name)[nameTableSize] names
    
     hitTable:
       long hitTableSize, // size in bytes
       hitTableRecord[hitTableSize] hits
    
     hitTableRecord:
       int hitCount,
       hitRecord[hitCount] hit
    
     hitRecord:
       long seqOffset, // byte offset into nameTable
       int pos         // biological position in sequence
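
    The header layout above can be sketched as a small reader and writer over a java.nio.ByteBuffer. This is an illustration only, not the library's implementation: it assumes the fields are stored back to back in the listed order with no padding, and in ByteBuffer's default big-endian byte order. The class name StoreHeader and its methods are hypothetical.

```java
import java.nio.ByteBuffer;

/** Hypothetical holder for the documented header fields; not part of the library. */
final class StoreHeader {
    final long hashTablePos;   // byte offset in file
    final long hitTablePos;    // byte offset in file
    final long nameTablePos;   // byte offset in file
    final int wordLength;
    final byte[] serializedPacking;

    StoreHeader(long hashTablePos, long hitTablePos, long nameTablePos,
                int wordLength, byte[] serializedPacking) {
        this.hashTablePos = hashTablePos;
        this.hitTablePos = hitTablePos;
        this.nameTablePos = nameTablePos;
        this.wordLength = wordLength;
        this.serializedPacking = serializedPacking;
    }

    /**
     * Reads the fields in the documented order, starting at the buffer's
     * current position. Assumes no padding and big-endian byte order.
     */
    static StoreHeader read(ByteBuffer buf) {
        long hashTablePos = buf.getLong();
        long hitTablePos = buf.getLong();
        long nameTablePos = buf.getLong();
        int wordLength = buf.getInt();
        int packingLength = buf.getInt();
        if (packingLength < 0 || packingLength > buf.remaining()) {
            throw new IllegalArgumentException(
                    "Bad serializedPackingLength: " + packingLength);
        }
        byte[] packing = new byte[packingLength];
        buf.get(packing);
        return new StoreHeader(hashTablePos, hitTablePos, nameTablePos,
                wordLength, packing);
    }

    /** Writes the same layout into a new buffer that is ready for reading. */
    ByteBuffer write() {
        ByteBuffer buf = ByteBuffer.allocate(
                3 * Long.BYTES + 2 * Integer.BYTES + serializedPacking.length);
        buf.putLong(hashTablePos).putLong(hitTablePos).putLong(nameTablePos);
        buf.putInt(wordLength).putInt(serializedPacking.length);
        buf.put(serializedPacking);
        buf.flip();
        return buf;
    }
}
```

    Because the three table positions are longs, they can point past the 2 gigabyte mark. A reader would then typically seek or map from that long offset, for example with FileChannel.map, which takes a long position.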
     

    Author:
    Matthew Pocock
    • Constructor Detail

      • NIODataStoreFactory

        public NIODataStoreFactory()
    • Method Detail

      • getDataStore

        public DataStore getDataStore(java.io.File storeFile)
                               throws java.io.IOException
        Description copied from interface: DataStoreFactory
        Get a pre-built data store associated with a file.
        Specified by:
        getDataStore in interface DataStoreFactory
        Parameters:
        storeFile - the File to map in as a data store
        Returns:
        the DataStore made by mapping the file
        Throws:
        java.io.IOException - if the file could not be mapped
      • buildDataStore

        public DataStore buildDataStore(java.io.File storeFile,
                                        SequenceDB seqDB,
                                        Packing packing,
                                        int wordLength,
                                        int threshold)
                                 throws IllegalAlphabetException,
                                        java.io.IOException,
                                        BioException
        Description copied from interface: DataStoreFactory
        Build a new DataStore.
        Specified by:
        buildDataStore in interface DataStoreFactory
        Parameters:
        storeFile - the file in which to store the data store
        seqDB - the SequenceDB to store in the data store
        packing - the Packing used to bit-encode the sequences
        wordLength - the number of symbols per word
        threshold - the number of times a word must appear before it is ignored
        Throws:
        IllegalAlphabetException - if the packing does not agree with the sequences
        BioException - if there is a problem building the data store
        java.io.IOException