SHOGUN  6.1.3
DataManager Class Reference

## Detailed Description

Class DataManager fetches/streams test data block-wise. It can handle data coming from multiple sources. The number of data sources is given by the num_distributions parameter of the constructor. The sources may be heterogeneous, and multiple blocks can be streamed per burst, as the computation requires. The size of the blocks and the number of blocks to be fetched per burst can be set externally.

This class is designed to be used on the stack. An instance of DataManager should not be serialized, copied, or moved around. In Shogun, it is intended to be used only inside the implementation part of a PIMPL.

Definition at line 63 of file DataManager.h.
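The usage constraint described above (no copying, instance kept inside a PIMPL implementation) can be sketched as follows. This is only an illustration: `BlockManager`, `Estimator` and `Estimator::Self` are hypothetical stand-ins, not Shogun classes.

```cpp
#include <cassert>
#include <memory>
#include <type_traits>

// Hypothetical stand-in for DataManager: copying is disabled.
class BlockManager
{
public:
    explicit BlockManager(int num_distributions)
        : m_num_distributions(num_distributions) {}
    BlockManager(const BlockManager&) = delete;
    BlockManager& operator=(const BlockManager&) = delete;
    int num_distributions() const { return m_num_distributions; }
private:
    int m_num_distributions;
};

// Hypothetical owner class that hides the manager behind a PIMPL.
class Estimator
{
public:
    explicit Estimator(int num_distributions);
    ~Estimator();
    int num_distributions() const;
private:
    struct Self;                 // would normally be defined in the .cpp file
    std::unique_ptr<Self> self;
};

struct Estimator::Self
{
    explicit Self(int n) : data_mgr(n) {}
    BlockManager data_mgr;       // constructed in place, never copied
};

Estimator::Estimator(int n) : self(std::make_unique<Self>(n)) {}
Estimator::~Estimator() = default;
int Estimator::num_distributions() const
{
    return self->data_mgr.num_distributions();
}
```

Because the manager is a plain member of the implementation struct, its lifetime is tied to the owner and it never needs to be copied or serialized on its own.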

## Public Member Functions

DataManager (index_t num_distributions)

DataManager (const DataManager &other)=delete

DataManager & operator= (const DataManager &other)=delete

~DataManager ()

void set_blocksize (index_t blocksize)

void set_num_blocks_per_burst (index_t num_blocks_per_burst)

InitPerFeature samples_at (index_t i)

CFeatures * samples_at (index_t i) const

index_t & num_samples_at (index_t i)

const index_t num_samples_at (index_t i) const

const index_t blocksize_at (index_t i) const

index_t get_num_samples () const

index_t get_min_blocksize () const

## Constructor & Destructor Documentation

 DataManager ( index_t num_distributions )

Constructor.

Parameters
 num_distributions number of data sources (i.e. CFeatures objects)

Definition at line 43 of file DataManager.cpp.

 DataManager ( const DataManager & other )
delete

Disabled copy constructor

Parameters
 other other instance

 ~DataManager ( )

Destructor

Definition at line 55 of file DataManager.cpp.

## Member Function Documentation

 const index_t blocksize_at ( index_t i ) const

Getter for the number of samples from a specified data source in a block.

Parameters
 i The data source index.
Returns
The number of samples from i-th data source in a block.

Definition at line 192 of file DataManager.cpp.

 index_t get_min_blocksize ( ) const
Returns
The minimum block-size that can be fetched from the specified data sources. For example, if there are two data sources with 20 and 30 samples, respectively, then the minimum blocksize is 5 (2 from the 1st data source, 3 from the 2nd), and there are then 10 such blocks.

Definition at line 72 of file DataManager.cpp.
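The arithmetic in the example above can be reproduced with a greatest common divisor. The following is a minimal sketch under that assumption; `gcd_of` and `min_blocksize` are hypothetical helper names, and the code is not taken from DataManager.cpp.

```cpp
#include <cassert>
#include <vector>

// Euclid's algorithm; gcd_of(0, x) returns x.
long gcd_of(long a, long b)
{
    while (b != 0)
    {
        long r = a % b;
        a = b;
        b = r;
    }
    return a;
}

// Sketch only: smallest block that keeps the per-source proportions intact.
// Returns 0 if no source has any samples.
long min_blocksize(const std::vector<long>& num_samples)
{
    long divisor = 0;
    for (long n : num_samples)
        divisor = gcd_of(divisor, n);
    if (divisor == 0)
        return 0;
    long blocksize = 0;
    for (long n : num_samples)
        blocksize += n / divisor;   // share of this source in one minimal block
    return blocksize;
}
```

For sources with 20 and 30 samples the divisor is 10, so one minimal block holds 2 + 3 = 5 samples and (20 + 30) / 5 = 10 such blocks exist.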

 index_t get_num_samples ( ) const
Returns
Total number of samples that can be fetched from all the data sources.

Definition at line 59 of file DataManager.cpp.

 index_t & num_samples_at ( index_t i )

Setter for the number of samples. Setting this number is mandatory for streaming features. For other types of feature objects, this number equals the number of vectors, and is set internally.

Example usage:

DataManager data_mgr(2); // two data sources
data_mgr.num_samples_at(0) = 10;
data_mgr.num_samples_at(1) = 15;
Parameters
 i The data source index, at which the number of samples is to be set.
Returns
A reference for the number of samples for the specified data source to be used as lvalue.

Definition at line 169 of file DataManager.cpp.
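The "reference used as lvalue" idiom described for this setter can be illustrated with a small self-contained class. `SampleCounts` is a hypothetical stand-in used only for this sketch; it is not part of Shogun and uses `long` where the real class uses `index_t`.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in that stores one sample count per data source.
class SampleCounts
{
public:
    explicit SampleCounts(std::size_t num_sources) : m_counts(num_sources, 0) {}

    // Non-const overload: returns a reference that can appear left of '='.
    long& num_samples_at(std::size_t i) { return m_counts.at(i); }

    // Const overload: read-only access by value.
    long num_samples_at(std::size_t i) const { return m_counts.at(i); }

    // Sum over all sources, analogous in spirit to get_num_samples().
    long total() const
    {
        long sum = 0;
        for (long n : m_counts)
            sum += n;
        return sum;
    }

private:
    std::vector<long> m_counts;
};
```

The compiler picks the non-const overload for a mutable object and the const overload for a const object or const reference, which is why the same name can serve as both setter and getter.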

 const index_t num_samples_at ( index_t i ) const

Getter for the number of samples.

Parameters
 i The data source index, from which the number of samples is to be obtained.
Returns
The number of samples for the specified data source.

Definition at line 179 of file DataManager.cpp.

 DataManager& operator= ( const DataManager & other )
delete

Disabled assignment operator

Parameters
 other other instance

 InitPerFeature samples_at ( index_t i )

Setter for a feature object as a data source. Since multiple data sources are supported, this method takes the index at which the feature object is to be set. Internally, it initializes a data fetcher object for the provided feature object.

Example usage:

DataManager data_mgr(2); // two data sources
// feats_0 = some CFeatures instance
// feats_1 = some CFeatures instance
data_mgr.samples_at(0) = feats_0;
data_mgr.samples_at(1) = feats_1;
Parameters
 i The data source index, at which the feature object is to be set as a data source.
Returns
An initializer for the specified data source (that sets up a fetcher for this feature), to be used as lvalue.

Definition at line 146 of file DataManager.cpp.
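The initializer-as-lvalue pattern described above can be sketched with a small proxy object whose assignment operator creates the fetcher. All types below (`Features`, `Fetcher`, `InitSlot`, `Manager`) are hypothetical stand-ins chosen for illustration; the actual InitPerFeature and data fetcher classes may differ.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

struct Features { long num_vectors; };   // hypothetical stand-in for CFeatures

struct Fetcher                            // hypothetical stand-in for a data fetcher
{
    explicit Fetcher(Features* f) : feats(f), num_samples(f->num_vectors) {}
    Features* feats;
    long num_samples;
};

// Proxy returned by the non-const samples_at(i).
// Assigning a feature object to it sets up the fetcher for that slot.
class InitSlot
{
public:
    explicit InitSlot(std::unique_ptr<Fetcher>& slot) : m_slot(slot) {}
    InitSlot& operator=(Features* feats)
    {
        m_slot.reset(new Fetcher(feats));
        return *this;
    }
private:
    std::unique_ptr<Fetcher>& m_slot;
};

class Manager
{
public:
    explicit Manager(std::size_t num_sources) : m_fetchers(num_sources) {}

    // Setter form: returns the proxy to be assigned to.
    InitSlot samples_at(std::size_t i) { return InitSlot(m_fetchers.at(i)); }

    // Getter form: returns the stored feature object, or nullptr if unset.
    Features* samples_at(std::size_t i) const
    {
        return m_fetchers.at(i) ? m_fetchers.at(i)->feats : nullptr;
    }

private:
    std::vector<std::unique_ptr<Fetcher>> m_fetchers;
};
```

Returning a proxy rather than a raw reference lets the assignment run extra setup code, here the construction of the fetcher, at the moment the feature object is provided.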

 CFeatures * samples_at ( index_t i ) const

Getter for the feature object at a given data source index.

Parameters
 i The data source index, from which the feature object is to be obtained
Returns
The underlying CFeatures object at the specified data source.

Definition at line 156 of file DataManager.cpp.

 void set_blocksize ( index_t blocksize )

Sets the blocksize for block-wise data fetching. It divides the block-size per data source according to the total number of feature vectors available from that source. More formally, if there are $$K$$ data sources, $$X_k$$, $$k \in [1,K]$$, with number of feature vectors $$n_{X_k}$$ from each, then setting a block-size of $$B$$ would mean that in each next() call of the data manager instance, it will fetch $$\rho_{X_k} B$$ samples from each $$X_k$$, where $$\rho_{X_k}=n_{X_k}/n$$, $$n=\sum_k n_{X_k}$$.

Parameters
 blocksize The size of the block consisting of data from all the sources.

Definition at line 91 of file DataManager.cpp.
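The proportional split described above can be worked through numerically. The sketch below assumes that the products in the formula divide evenly, so that no rounding is needed; `split_blocksize` is a hypothetical helper, not a Shogun function.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Sketch only: split a total block size B across sources in proportion to
// their sample counts, i.e. rho_k * B with rho_k = n_k / n.
// Assumes B * n_k is divisible by n for every source.
std::vector<long> split_blocksize(long blocksize,
                                  const std::vector<long>& num_samples)
{
    long total = 0;                          // n = sum_k n_k
    for (long n_k : num_samples)
        total += n_k;

    std::vector<long> per_source(num_samples.size(), 0);
    if (total == 0)
        return per_source;                   // nothing to split

    for (std::size_t k = 0; k < num_samples.size(); ++k)
        per_source[k] = blocksize * num_samples[k] / total;
    return per_source;
}
```

With two sources of 20 and 30 samples and a block-size of 10, the shares are 10 * 20 / 50 = 4 and 10 * 30 / 50 = 6 samples per block.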

 void set_num_blocks_per_burst ( index_t num_blocks_per_burst )

In order to speed up the computation, usually a number of blocks are fetched at once per next() call. This method sets that number.

Parameters
 num_blocks_per_burst The number of blocks to be fetched in a burst.

Definition at line 117 of file DataManager.cpp.
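To make the effect of this setting concrete, the sketch below computes how many samples one burst would draw from a single source and how many bursts are needed to cover a given number of blocks. Both helpers are hypothetical and rest on the assumption that a burst simply fetches num_blocks_per_burst consecutive blocks; the actual fetching logic is not shown on this page.

```cpp
#include <cassert>

// Hypothetical helper: samples drawn from one source in a single burst.
long samples_per_burst(long blocksize_at_source, long num_blocks_per_burst)
{
    return blocksize_at_source * num_blocks_per_burst;
}

// Hypothetical helper: bursts needed to cover num_blocks blocks, rounded up.
// Assumes num_blocks_per_burst > 0.
long num_bursts(long num_blocks, long num_blocks_per_burst)
{
    return (num_blocks + num_blocks_per_burst - 1) / num_blocks_per_burst;
}
```

For instance, with 2 samples per block from a source and 4 blocks per burst, each burst would draw 8 samples from that source, and 10 blocks would take 3 bursts, the last one being partial.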

The documentation for this class was generated from the following files: DataManager.h and DataManager.cpp.
