Data

This module contains wrappers around pandas.DataFrame that supply data to estimators.

Storage

This is a wrapper around pandas.DataFrame that gives you a simple way to define a dataset for an estimator.

class rep.data.storage.LabeledDataStorage(data, target=None, sample_weight=None, random_state=None, shuffle=False)

Bases: object

This class implements the data interface used for training estimators. It holds the data, labels, and weights: all the information needed to train a model.

Parameters:
  • data (pandas.DataFrame) – data
  • target (None or numbers.Number or array-like) – labels for classification or target values for regression (set to None for predict methods)
  • sample_weight (None or numbers.Number or array-like) – sample weights (set to None for predict methods)
  • random_state (None or int or RandomState) – seed or state for the pseudorandom number generator
  • shuffle (bool) – whether to shuffle the data
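
The parameter types above can be illustrated with a small stand-in class. This is only a sketch under stated assumptions, not the REP implementation: `StorageSketch` and `normalize_column` are hypothetical names, and broadcasting a single number to every row is an assumption inferred from the `numbers.Number` type listed for `target` and `sample_weight`.

```python
import numbers

import numpy as np
import pandas as pd


def normalize_column(values, n_rows):
    """Hypothetical helper: map None / a number / an array-like to a per-row array."""
    if values is None:
        return None
    if isinstance(values, numbers.Number):
        # Assumption: a single number applies to every row.
        return np.full(n_rows, values)
    arr = np.asarray(values)
    if len(arr) != n_rows:
        raise ValueError("expected %d values, got %d" % (n_rows, len(arr)))
    return arr


class StorageSketch(object):
    """Illustrative stand-in, NOT rep.data.storage.LabeledDataStorage."""

    def __init__(self, data, target=None, sample_weight=None,
                 random_state=None, shuffle=False):
        n_rows = len(data)
        self.data = data
        self.target = normalize_column(target, n_rows)
        self.sample_weight = normalize_column(sample_weight, n_rows)
        # Accept None, an int seed, or an existing RandomState instance.
        if isinstance(random_state, np.random.RandomState):
            rng = random_state
        else:
            rng = np.random.RandomState(random_state)
        # Row order: a seeded permutation when shuffle=True, original order otherwise.
        self.indices = rng.permutation(n_rows) if shuffle else np.arange(n_rows)


frame = pd.DataFrame({"mass": [1.0, 2.0, 3.0, 4.0],
                      "pt": [10.0, 20.0, 30.0, 40.0]})
storage = StorageSketch(frame, target=[0, 1, 0, 1], sample_weight=1.0,
                        random_state=42, shuffle=True)
```

With an integer seed the shuffled order is reproducible, which is the usual reason for exposing `random_state` next to `shuffle`.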
col(index)

Get the requested columns

Parameters: index (None or str or list(str)) – column names
Return type: pandas.Series or pandas.DataFrame
eval_column(expression)

Evaluate an expression to get the required data

Return type: numpy.array
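
The page does not say which expression syntax eval_column accepts. As a rough illustration of the idea, the sketch below uses pandas.DataFrame.eval directly on a plain DataFrame and converts the result to a NumPy array; treating eval_column as similar to this is an assumption, and the column names `px` and `py` are made up for the example.

```python
import numpy as np
import pandas as pd

# Hypothetical columns, chosen only for the example.
frame = pd.DataFrame({"px": [3.0, 6.0], "py": [4.0, 8.0]})

# DataFrame.eval parses the string and computes it column-wise;
# np.asarray turns the resulting Series into a NumPy array.
pt_squared = np.asarray(frame.eval("px ** 2 + py ** 2"))
```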
get_data(features=None)

Get the data for the estimator

Parameters: features (None or list[str]) – set of feature names (if None, all features in the data storage are used)
Return type: pandas.DataFrame
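
The documented behaviour of the `features` argument (None means all features) can be sketched with plain pandas. `select_features` is a hypothetical helper written for this illustration, not part of REP, and the column names are made up.

```python
import pandas as pd


def select_features(data, features=None):
    """Hypothetical helper mirroring the documented `features` argument."""
    if features is None:
        # None means: use every column in the storage.
        return data
    # Selecting with a list of names always yields a DataFrame.
    return data[list(features)]


frame = pd.DataFrame({"mass": [1.0, 2.0],
                      "pt": [10.0, 20.0],
                      "eta": [0.1, 0.2]})
subset = select_features(frame, ["mass", "pt"])
everything = select_features(frame)
```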
get_indices()

Get the data indices

Return type: numpy.array
get_targets()

Get the sample targets for the estimator

Return type: numpy.array
get_weights(allow_nones=False)

Get the sample weights for the estimator

Return type: numpy.array
