Interface DataFileModel<P>

All Known Implementing Classes:
AbstractDataFileModel, DefaultDataFileModel, StandardIODataFileModel

public interface DataFileModel<P>
Data file model: a factory that creates and removes file-like objects ("data files"). It is used by the large memory model.

There are the following standard implementations of this interface:

  • DefaultDataFileModel
  • StandardIODataFileModel

You may create your own implementations, or override some methods of the standard ones, to get maximal control over the large memory model. The simplest example is overriding the delete(DataFile) method to implement a custom file deletion technique, for instance, moving files to some "Recycle Bin". Another example: you may override the recommendedNumberOfBanks() and recommendedBankSize(boolean) methods to specify custom values for the number of banks and the bank size in a concrete data file model.
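
The "Recycle Bin" idea can be sketched with the standard java.nio.file API alone. This is only an illustration under stated assumptions: the class RecycleBinSketch and its methods are hypothetical and are not part of this package; a real implementation would place similar logic inside an overridden delete(DataFile) method.

```java
import java.io.IOError;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical helper (not part of the AlgART API): moves a file into a
// "recycle bin" directory instead of deleting it.
final class RecycleBinSketch {
    private RecycleBinSketch() {
    }

    // Returns false if there is nothing to move, true after a successful move,
    // and throws IOError on failure, mirroring the delete(DataFile) contract.
    static boolean moveToRecycleBin(Path file, Path recycleBin) {
        if (!Files.exists(file)) {
            return false;
        }
        try {
            Files.createDirectories(recycleBin);
            Path target = recycleBin.resolve(file.getFileName());
            Files.move(file, target, StandardCopyOption.REPLACE_EXISTING);
            return true;
        } catch (IOException e) {
            throw new IOError(e);
        }
    }

    // Small self-check: {first call result, second call result, file is in the bin}.
    static boolean[] demo() {
        try {
            Path file = Files.createTempFile("datafile", ".tmp");
            Path bin = Files.createTempDirectory("recycle-bin");
            boolean first = moveToRecycleBin(file, bin);
            boolean second = moveToRecycleBin(file, bin);
            boolean inBin = Files.exists(bin.resolve(file.getFileName()));
            return new boolean[] {first, second, inBin};
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        boolean[] r = demo();
        System.out.println("moved=" + r[0] + ", movedAgain=" + r[1] + ", inBin=" + r[2]);
    }
}
```

The second call returns false because the source file no longer exists, which matches the "nothing to do" case described for delete(DataFile).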

The data file model uses a class, specified as the generic argument P and returned by the pathClass() method, for working with data file paths: unique names that identify the file position in the file system. Usually this class is java.io.File, and it describes a standard path to a disk file. Custom implementations (not inherited from AbstractDataFileModel) may use other classes for specifying file paths.

Objects implementing this interface are not required to be immutable or thread-safe, but they must be thread-compatible (allow manual synchronization for multithreaded access).

Author:
Daniel Alievsky
  • Method Summary

    Modifier and Type
    Method
    Description
    Set<DataFile>
    allTemporaryFiles()
    Returns the set of all data files that are temporary and should be automatically deleted during system shutdown.
    boolean
    autoResizingOnMapping()
    If this method returns true, then mapping the data file by a map(position, size) call automatically increases the file length to position+size if the current file length is less than this value.
    DataFile
    createTemporary(boolean unresizable)
    Creates a new temporary data file and returns a new instance of DataFile object corresponding to it.
    boolean
    delete(DataFile dataFile)
    Deletes the data file.
    void
    finalizationNotify(P dataFilePath, boolean isApplicationShutdown)
    This method is automatically called when the data file becomes unreachable, either due to garbage collection (when all AlgART arrays, using this data file, became unreachable), or due to finishing the application (in the standard cleanup procedure, performed by this package).
    DataFile
    getDataFile(P path, ByteOrder byteOrder)
    Returns a new instance of DataFile object corresponding to the given path.
    P
    getPath(DataFile dataFile)
    Returns the path describing unique position of the data file (usually the absolute path to the disk file).
    boolean
    isAutoDeletionRequested()
    Returns true if the standard cleanup procedure, which deletes all temporary files (as described in comments to allTemporaryFiles() method), is necessary for this file model.
    Class<P>
    pathClass()
    Returns the type of the data file paths used by this model.
    int
    recommendedBankSize(boolean unresizable)
    The size of every memory bank in bytes, recommended for data files created by this factory.
    int
    recommendedNumberOfBanks()
    The number of memory banks, recommended for data files created by this factory.
    long
    recommendedPrefixSize()
    The size (in bytes) of the starting gap in all temporary files, created by createTemporary(boolean) method.
    int
    recommendedSingleMappingLimit()
    If a mapped AlgART array is unresizable and its size, in bytes, is less than or equal to the result of this method, then the whole data file is mapped by a single large bank.
    void
    setTemporary(DataFile dataFile, boolean value)
    If value is true, adds the passed data file instance into the internal collection returned by allTemporaryFiles() method; if value is false, removes it from that collection.
  • Method Details

    • pathClass

      Class<P> pathClass()
      Returns the type of the data file paths used by this model. The returned class is equal to the generic type argument of this interface.

      This method never throws exceptions.

      Returns:
      the type of the data file paths used by this model.
    • getDataFile

      DataFile getDataFile(P path, ByteOrder byteOrder)
      Returns a new instance of DataFile object corresponding to the given path. This path will be returned by getPath(DataFile) method for the returned object.

      The passed byte order will be used for mapping this file: the DataFile.map(DataFile.Range, boolean) method of the data file will return a ByteBuffer with this byte order.
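
      The effect of the byte order on a mapped buffer can be illustrated with the standard FileChannel API. This is a sketch under assumptions, not the implementation used by this package: MappingByteOrderSketch and its methods are hypothetical names, and the example maps an ordinary temporary file directly.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteOrder;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical illustration (not part of the AlgART API).
final class MappingByteOrderSketch {
    private MappingByteOrderSketch() {
    }

    // Maps the first `size` bytes of the file and applies the requested byte order.
    static MappedByteBuffer map(Path path, long size, ByteOrder order) throws IOException {
        try (FileChannel channel = FileChannel.open(
                path, StandardOpenOption.READ, StandardOpenOption.WRITE)) {
            MappedByteBuffer buffer = channel.map(FileChannel.MapMode.READ_WRITE, 0, size);
            buffer.order(order); // all multi-byte accesses now use this order
            return buffer;       // the mapping stays valid after the channel is closed
        }
    }

    // Writes the int value 1 at offset 0 and returns the first byte of the mapping.
    static byte firstByteAfterWritingOne(ByteOrder order) {
        try {
            Path path = Files.createTempFile("datafile", ".bin");
            path.toFile().deleteOnExit();
            Files.write(path, new byte[16]);
            MappedByteBuffer buffer = map(path, 16, order);
            buffer.putInt(0, 1);
            return buffer.get(0);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(firstByteAfterWritingOne(ByteOrder.LITTLE_ENDIAN)); // prints 1
        System.out.println(firstByteAfterWritingOne(ByteOrder.BIG_ENDIAN));    // prints 0
    }
}
```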

      The physical object (for example, a disk file) described by the path should already exist. This method does not attempt to create the physical file; it only creates a new Java object associated with an existing file.

      This method never throws java.io.IOError.

      Parameters:
      path - the path describing unique position of the existing data file.
      byteOrder - the byte order that will always be used for mapping this file.
      Returns:
      new instance of DataFile object.
      Throws:
      NullPointerException - if one of the passed arguments is null.
    • getPath

      P getPath(DataFile dataFile)
      Returns the path describing unique position of the data file (usually the absolute path to the disk file).

      This method never throws java.io.IOError.

      Parameters:
      dataFile - the data file.
      Returns:
      the path describing unique position of the data file.
      Throws:
      NullPointerException - if the argument is null.
      ClassCastException - if the data file was created by an incompatible data file model.
    • createTemporary

      DataFile createTemporary(boolean unresizable)
      Creates a new temporary data file and returns a new instance of DataFile object corresponding to it.

      The DataFile.map(DataFile.Range, boolean) method of the created data file will return a ByteBuffer with some byte order that depends on the implementation.

      The returned instance is added to an internal collection, returned by the allTemporaryFiles() method. This action is optional, but it is performed by all implementations from this package. In your own implementation you may choose not to support this collection if you are absolutely sure that automatic file deletion, performed by this package, as well as any possible custom cleanup procedures for temporary files, are not useful for your data files. (In this case, please keep in mind Sun's bug #4171239 in Java 1.5 and 1.6: "java.io.File.deleteOnExit does not work on open files (win32)." Automatic deletion performed by this package includes closing the file, which avoids this bug.)

      Parameters:
      unresizable - true if this file will be used for unresizable arrays only. This is an informational flag: for example, it may be used for choosing the file name or directory. Setting this flag does not mean that the DataFile.length(long) method will not be called to change the file length; it will be called at least once.
      Returns:
      a new instance of DataFile object corresponding to the newly created temporary data file.
      Throws:
      IOError - in case of any disk errors.
    • delete

      boolean delete(DataFile dataFile)
      Deletes the data file. Returns true if the file was successfully deleted, or false if the file does not exist (nothing to do). Usually this means deletion of the disk file, but some file models may override this behavior (for example, they may move the file into some special directory).

      This method is called automatically for temporary files by the garbage collector and by the standard cleanup procedure performed by this package.

      Warning: unlike java.io.File.delete(), this method must throw an exception (java.io.IOError) if any problem occurs during file deletion. (java.io.File.delete() returns false in this situation.)

      In case of successful deletion, this method excludes the path of this file from the internal set returned by the allTemporaryFiles() method.

      The passed data file must have been created by the same data file model.

      This method should be synchronized, usually on the internal set whose copy or view is returned by the allTemporaryFiles() method. The reason is that it is usually called from several threads: at least from the main calculation thread, from the garbage collector (finalization thread) and from the built-in shutdown hook. If this method is not internally synchronized, it may try to remove the same file several times, which leads to logging a warning that the file "cannot be deleted".
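
      The contract above (true on success, false if the file is already gone, IOError on failure, synchronization on the internal set) can be sketched as follows. This is a simplified illustration, not the code of this package: TemporaryFileTrackerSketch is a hypothetical class that tracks plain java.io.File objects instead of DataFile instances.

```java
import java.io.File;
import java.io.IOError;
import java.io.IOException;
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

// Hypothetical tracker of temporary files (not part of the AlgART API).
final class TemporaryFileTrackerSketch {
    // Internal set; every access is synchronized on it.
    private final Set<File> temporaryFiles = new HashSet<>();

    File createTemporary() {
        try {
            File file = File.createTempFile("datafile", ".tmp");
            synchronized (temporaryFiles) {
                temporaryFiles.add(file);
            }
            return file;
        } catch (IOException e) {
            throw new IOError(e);
        }
    }

    // true: the file existed and was deleted; false: nothing to do;
    // IOError: the file exists but could not be deleted.
    boolean delete(File file) {
        synchronized (temporaryFiles) {
            if (!file.exists()) {
                return false;
            }
            if (!file.delete()) {
                throw new IOError(new IOException("Cannot delete " + file));
            }
            temporaryFiles.remove(file);
            return true;
        }
    }

    // Returns a newly allocated, unmodifiable copy of the internal set.
    Set<File> allTemporaryFiles() {
        synchronized (temporaryFiles) {
            return Collections.unmodifiableSet(new HashSet<>(temporaryFiles));
        }
    }

    public static void main(String[] args) {
        TemporaryFileTrackerSketch tracker = new TemporaryFileTrackerSketch();
        File file = tracker.createTemporary();
        System.out.println(tracker.delete(file)); // first attempt deletes the file
        System.out.println(tracker.delete(file)); // repeated attempt has nothing to do
    }
}
```

      Because the existence check and the deletion happen under one lock, two threads that race to delete the same file cannot both reach File.delete() for it.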

      Parameters:
      dataFile - the data file that should be deleted.
      Returns:
      true if and only if the data file existed and was successfully deleted, false if the data file does not exist (it may have been deleted already).
      Throws:
      NullPointerException - if the passed data file is null.
      ClassCastException - if the data file was created by an incompatible data file model.
      IOError - in case of any problems during file deletion.
    • finalizationNotify

      void finalizationNotify(P dataFilePath, boolean isApplicationShutdown)
      This method is automatically called when the data file becomes unreachable, either due to garbage collection (when all AlgART arrays, using this data file, became unreachable), or due to finishing the application (in the standard cleanup procedure, performed by this package).

      Please compare: unlike delete(DataFile), this method is called not only for temporary files, but also for data files opened via the LargeMemoryModel.asArray / LargeMemoryModel.asUpdatableArray methods, and for the underlying data files of arrays that were declared non-temporary via the LargeMemoryModel.setTemporary(Array, boolean) method.

      This method is called after all other operations with this data file, in particular, after its automatic deletion by the delete(DataFile) method.

      The implementations of this method provided by this package do nothing. But you may override it in a custom data file model to inform the application that the data file is no longer needed and, for example, may be deleted by your non-standard file deletion mechanism.

      Please note: if the LargeMemoryModel.asArray / LargeMemoryModel.asUpdatableArray methods are called several times for the same external file, then each call produces a separate DataFile instance. So, this method will be called several times for this file (with the same dataFilePath argument).
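
      Since the same path may be reported several times, a custom handler should tolerate repeated notifications. A minimal sketch, assuming P is java.io.File; the class FinalizationCounterSketch and the method notificationCount are hypothetical and only mimic the shape of finalizationNotify:

```java
import java.io.File;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical handler (not part of the AlgART API): counts notifications per path.
final class FinalizationCounterSketch {
    private final Map<File, Integer> notifications = new ConcurrentHashMap<>();

    // Same parameter list as finalizationNotify(P, boolean) with P = java.io.File.
    // It may be called from the finalization thread or from the shutdown hook,
    // so the counter update is atomic.
    void finalizationNotify(File dataFilePath, boolean isApplicationShutdown) {
        notifications.merge(dataFilePath, 1, Integer::sum);
    }

    int notificationCount(File dataFilePath) {
        return notifications.getOrDefault(dataFilePath, 0);
    }

    public static void main(String[] args) {
        FinalizationCounterSketch handler = new FinalizationCounterSketch();
        File path = new File("data.bin");
        handler.finalizationNotify(path, false);
        handler.finalizationNotify(path, true);
        System.out.println(handler.notificationCount(path)); // prints 2
    }
}
```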

      Parameters:
      dataFilePath - the path describing unique position of the data file.
      isApplicationShutdown - true if this method is called by the cleanup procedure, performed by this package, while finishing the application; false if it is called from the garbage collector.
    • allTemporaryFiles

      Set<DataFile> allTemporaryFiles()
      Returns the set of all data files that are temporary and should be automatically deleted during system shutdown. The returned set is an immutable view or a newly allocated copy of an internal set stored in this instance. The returned instance must not be null (but may be the empty set).

      Usually this method returns the set of temporary files that were created by the createTemporary(boolean) method of this factory instance, but have not yet been successfully deleted by the delete(DataFile) method.

      This package includes an automatic cleanup procedure that is performed in the internal shutdown hook and calls delete(DataFile) for all data files returned by this method, for all instances of this class that have been used since the application start and that return true from the isAutoDeletionRequested() method. You may install additional cleanup procedures, which will be called before or after this one, via the Arrays.addShutdownTask(Runnable, Arrays.TaskExecutionOrder) method.

      Returns:
      the set of all created temporary data files.
    • setTemporary

      void setTemporary(DataFile dataFile, boolean value)
      If value is true, adds the passed data file instance into the internal collection returned by allTemporaryFiles() method; if value is false, removes it from that collection.

      This method does nothing if isAutoDeletionRequested() returns false.

      This method is called only from the LargeMemoryModel.setTemporary(net.algart.arrays.Array, boolean) method.

      In some data file models (implemented in other packages) this method may do nothing or throw an exception.

      Parameters:
      dataFile - the data file.
      value - specifies whether the data file should be included into or excluded from the internal collection of temporary files.
      Throws:
      NullPointerException - if the passed data file is null.
    • isAutoDeletionRequested

      boolean isAutoDeletionRequested()
      Returns true if the standard cleanup procedure, that deletes all temporary files (as described in comments to allTemporaryFiles() method), is necessary for this file model. In this case, the instance of this model is automatically registered, while creating any LargeMemoryModel instance with this model, in the internal static collection that can be retrieved by LargeMemoryModel.allUsedDataFileModelsWithAutoDeletion() method.

      This method returns true for all implementations from this package. If you have implemented your own cleanup procedure in your implementation of this interface, you may return false from this method. If this method returns false, the implementations of the createTemporary(boolean) method in this package do not add the file name to the internal collection returned by the allTemporaryFiles() method.

      If this method returns false, it does not mean that temporary files will not be deleted automatically. It only means that this data file model instance will not be registered in the internal static collection (available via the LargeMemoryModel.allUsedDataFileModelsWithAutoDeletion() method) and that the createTemporary(boolean) method will not register the file name in the allTemporaryFiles() collection. To avoid automatic file deletion, you must call the LargeMemoryModel.setTemporary(Array, boolean) method with false as the second argument.

      Returns:
      true if the temporary data files, created by this model, should be automatically deleted by the standard cleanup procedure.
    • recommendedNumberOfBanks

      int recommendedNumberOfBanks()
      The number of memory banks, recommended for data files created by this factory. AlgART arrays, based on data file mapping, allocate this number of memory banks and load portions of the large data file into them.

      The returned number of banks must not be less than 2. Otherwise, an attempt to create an Array instance will throw an exception. Usual values are 8-16.

      Please note that many algorithms, on multiprocessor or multi-core systems, use several parallel threads for processing arrays: see Arrays.ParallelExecutor. So, the number of banks should be sufficient for parallel use by all CPU units, to avoid frequent bank swapping. There should be at least 2 banks per CPU unit, or better 3-4 banks (for complex random-access algorithms).
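
      One possible way to derive the number of banks from the CPU count is sketched below. The class BankCountSketch and the parameter banksPerCpu are assumptions made for illustration; the package itself does not prescribe this formula.

```java
// Hypothetical helper (not part of the AlgART API).
final class BankCountSketch {
    private BankCountSketch() {
    }

    // banksPerCpu banks for every CPU unit, but never fewer than the minimum of 2.
    static int numberOfBanks(int cpuCount, int banksPerCpu) {
        if (cpuCount < 1 || banksPerCpu < 1) {
            throw new IllegalArgumentException("Both arguments must be positive");
        }
        long banks = (long) cpuCount * banksPerCpu;
        return (int) Math.min(Integer.MAX_VALUE, Math.max(2L, banks));
    }

    public static void main(String[] args) {
        int cpuCount = Runtime.getRuntime().availableProcessors();
        System.out.println(numberOfBanks(cpuCount, 3));
    }
}
```

      A custom data file model could return such a value from an overridden recommendedNumberOfBanks() method.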

      Returns:
      the recommended number of memory banks.
    • recommendedBankSize

      int recommendedBankSize(boolean unresizable)
      The size of every memory bank in bytes, recommended for data files created by this factory. AlgART arrays, based on data file mapping, allocate memory banks with this size and load portions of the large data file into them.

      The unresizable flag specifies whether this bank size will be used for a data file that stores unresizable arrays only. In this case, this method may return a greater value than when unresizable is false. The reason is that data files containing resizable arrays may grow in blocks whose size is equal to the bank size. If the bank size is 8 MB, then any resizable array, created by MemoryModel.newIntArray(long) or a similar method, will occupy at least 8 MB of disk space, even if its length is only several int values. For unresizable arrays, created by MemoryModel.newUnresizableIntArray(long) and similar methods, the file size is usually fixed at creation time and the bank size information is not used.

      The returned size must be a power of two (2^k) and must not be less than 256. Otherwise, an attempt to create an Array instance will throw an exception.

      We recommend using large banks to reduce bank swapping. But do not specify too large values here: every opened data file uses recommendedNumberOfBanks()*recommendedBankSize(boolean) bytes of the address space, which is limited to ~1.0-1.5 GB under a 32-bit OS. Typical values are 2-8 MB for unresizable arrays (when the argument is true) and 64-256 KB for resizable ones (when the argument is false).
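
      The two constraints on the bank size and the address-space estimate can be checked with a few lines of arithmetic. BankSizeSketch and its method names are hypothetical and only restate the rules given above.

```java
// Hypothetical helper (not part of the AlgART API).
final class BankSizeSketch {
    private BankSizeSketch() {
    }

    // A valid bank size is a power of two that is not less than 256.
    static boolean isValidBankSize(int bankSize) {
        return bankSize >= 256 && Integer.bitCount(bankSize) == 1;
    }

    // Address space used by one opened data file: number of banks * bank size.
    static long addressSpacePerFile(int numberOfBanks, int bankSize) {
        return (long) numberOfBanks * bankSize;
    }

    public static void main(String[] args) {
        int bankSize = 8 * 1024 * 1024; // 8 MB
        System.out.println(isValidBankSize(bankSize));         // prints true
        System.out.println(addressSpacePerFile(16, bankSize)); // prints 134217728
    }
}
```

      With 16 banks of 8 MB each, one opened file takes 128 MB of address space, so fewer than ten such files fit into the ~1.0-1.5 GB available to a 32-bit process.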

      Parameters:
      unresizable - true if this bank size will be used for unresizable arrays only.
      Returns:
      the recommended size of every memory bank in bytes.
    • recommendedSingleMappingLimit

      int recommendedSingleMappingLimit()
      If a mapped AlgART array is unresizable and its size, in bytes, is less than or equal to the result of this method, then the whole data file is mapped by a single large bank. The usual bank size is ignored in this case, and only one bank is used.

      If this data file model is based on true low-level mapping, like DefaultDataFileModel, then a large value returned by this method improves performance and also helps to avoid Sun's bug in Java 1.5 and 1.6: "(fc) "Cleaner terminated abnormally" error in simple mapping test". In a modern 32-bit JRE, a value of about 16-32 MB looks suitable.

      We don't recommend setting this limit too high in a 32-bit JRE: every mapping reduces the available address space, which is limited to only 1.0-1.5 GB.

      If the result of this method is zero or negative, this behavior is not used.
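
      The decision rule described above can be written as a single predicate. SingleMappingSketch is a hypothetical name; the sketch only restates the documented conditions and does not show how this package applies them internally.

```java
// Hypothetical helper (not part of the AlgART API).
final class SingleMappingSketch {
    private SingleMappingSketch() {
    }

    // Single-bank mapping is used only for unresizable arrays whose size does not
    // exceed a positive limit; a zero or negative limit disables the behavior.
    static boolean useSingleMapping(boolean unresizable, long sizeInBytes, int singleMappingLimit) {
        return singleMappingLimit > 0 && unresizable && sizeInBytes <= singleMappingLimit;
    }

    public static void main(String[] args) {
        int limit = 16 * 1024 * 1024; // 16 MB
        System.out.println(useSingleMapping(true, limit, limit));      // prints true
        System.out.println(useSingleMapping(true, limit + 1L, limit)); // prints false
        System.out.println(useSingleMapping(false, 1024, limit));      // prints false
    }
}
```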

      Returns:
      the recommended limit for the file size, in bytes: unresizable files that are not larger than this limit should be mapped only once, by a single call of the DataFile.map(net.algart.arrays.DataFile.Range, boolean) method.
    • recommendedPrefixSize

      long recommendedPrefixSize()
      The size (in bytes) of the starting gap in all temporary files, created by createTemporary(boolean) method. The first element of the newly created arrays is placed at this offset in the data file.

      This gap will not be used for storing array elements, but may be used for saving some additional information (prefix) if you decide not to remove this file during garbage collection, for example, with the help of the LargeMemoryModel.setTemporary(Array, boolean) method.

      If the result of this method is zero or negative, there will be no starting gap. (Negative values are interpreted as zero.)
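
      How the prefix shifts element positions can be sketched as follows. This assumes fixed-size elements stored contiguously after the prefix, which the text above states only for the first element; PrefixOffsetSketch and elementOffset are hypothetical names.

```java
// Hypothetical helper (not part of the AlgART API).
final class PrefixOffsetSketch {
    private PrefixOffsetSketch() {
    }

    // Byte offset of the element with the given index, assuming contiguous
    // fixed-size elements. Negative prefix sizes are interpreted as zero.
    static long elementOffset(long recommendedPrefixSize, long index, int bytesPerElement) {
        long prefix = Math.max(0L, recommendedPrefixSize);
        return Math.addExact(prefix, Math.multiplyExact(index, (long) bytesPerElement));
    }

    public static void main(String[] args) {
        System.out.println(elementOffset(1024, 0, 4));  // prints 1024
        System.out.println(elementOffset(1024, 10, 4)); // prints 1064
        System.out.println(elementOffset(-5, 0, 4));    // prints 0
    }
}
```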

      Returns:
      the size of the starting gap in the temporary files, in bytes.
    • autoResizingOnMapping

      boolean autoResizingOnMapping()
      If this method returns true, then mapping the data file by a map(position, size) call automatically increases the file length to position+size if the current file length is less than this value. In this case, this package will not call the DataFile.length(long) method for increasing the temporary file length.

      The described behavior of mapping usually depends on the platform. So, this method should return false in most cases.

      For unresizable files (i.e. for arrays that are created unresizable), the result of this method is not used: these files are never mapped outside their original lengths.
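
      The difference between the two modes can be illustrated with standard Java I/O: when automatic resizing is not relied upon, the file is grown explicitly before the range is mapped. This is a sketch under assumptions, not the logic of this package: ExplicitResizeSketch is hypothetical, and RandomAccessFile.setLength plays the role of DataFile.length(long).

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.io.UncheckedIOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;

// Hypothetical illustration (not part of the AlgART API).
final class ExplicitResizeSketch {
    private ExplicitResizeSketch() {
    }

    // Maps [position, position + size). If automatic resizing is not relied upon,
    // the file is first grown explicitly to cover the whole range.
    static MappedByteBuffer mapRange(RandomAccessFile file, long position, long size,
                                     boolean autoResizingOnMapping) throws IOException {
        long required = Math.addExact(position, size);
        if (!autoResizingOnMapping && file.length() < required) {
            file.setLength(required);
        }
        return file.getChannel().map(FileChannel.MapMode.READ_WRITE, position, size);
    }

    // Creates an empty temporary file, maps the range with explicit resizing
    // and returns the resulting file length.
    static long lengthAfterMapping(long position, long size) {
        try {
            File temp = File.createTempFile("datafile", ".bin");
            temp.deleteOnExit();
            try (RandomAccessFile file = new RandomAccessFile(temp, "rw")) {
                mapRange(file, position, size, false);
                return file.length();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(lengthAfterMapping(4096, 4096)); // prints 8192
    }
}
```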

      Returns:
      true if mapping outside the file length automatically increases the length.