speechtotext.benchmark.benchmarks.Benchmark

class Benchmark(with_cleaning=True)

Bases: ABC

Benchmark is used to test/validate a model. Parent class for all benchmark classes.

Create benchmark object.

Parameters:

with_cleaning (bool, optional) – Set True to clean transcripts. Defaults to True.

Methods

convert_to_pandas

Convert metrics to dataframe.

create_models

Creates a list of ModelWrappers.

save_to_csv

Save outputs of benchmark to csv.

set_dataset

Set dataset for Benchmark class.

update_samples

Update the sample dataset.

Attributes

BENCHMARK_SAMPLES

Dataset samples.

DATASET

Original dataset.

ERROR_LIST

List of errors.

BENCHMARK_SAMPLES: Dataset = None

Dataset samples.

Type:

Dataset

DATASET: Dataset = None

Original dataset.

Type:

Dataset

ERROR_LIST: list[DataFrame] = []

List of errors.

Type:

list[pd.core.frame.DataFrame]

__call__(number_of_samples, with_cleaning=True)

Benchmark n samples.

Parameters:
  • number_of_samples (int) – Number of samples to benchmark.

  • with_cleaning (bool, optional) – Set True to clean transcripts. Defaults to True.

convert_to_pandas()

Convert metrics to dataframe.

Returns:

Pandas dataframe.

Return type:

pd.core.frame.DataFrame

abstract create_models()

Creates a list of ModelWrappers.

Returns:

List of model wrappers.

Return type:

list[ModelWrapper]
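
Because create_models is abstract, a concrete benchmark has to override it before it can be instantiated. The following is a minimal sketch of that pattern only. It does not import the real package: the Benchmark and ModelWrapper classes below are hypothetical stand-ins that mirror the documented constructor and abstract method, and TwoModelBenchmark and the model names are invented for illustration.

```python
from abc import ABC, abstractmethod


class ModelWrapper:
    """Hypothetical stand-in for the library's ModelWrapper."""

    def __init__(self, name):
        self.name = name


class Benchmark(ABC):
    """Stand-in mirroring only the documented constructor and abstract method."""

    def __init__(self, with_cleaning=True):
        self.with_cleaning = with_cleaning

    @abstractmethod
    def create_models(self):
        """Return a list of ModelWrapper objects."""


class TwoModelBenchmark(Benchmark):
    """Hypothetical subclass that supplies the models to benchmark."""

    def create_models(self):
        return [ModelWrapper("model-a"), ModelWrapper("model-b")]


# Instantiating the abstract parent raises TypeError; the subclass works.
try:
    Benchmark()
    abstract_instantiated = True
except TypeError:
    abstract_instantiated = False

benchmark = TwoModelBenchmark(with_cleaning=False)
model_names = [model.name for model in benchmark.create_models()]
```

In the real package, the subclass would presumably return wrappers around actual speech-to-text models; how those wrappers are constructed is not covered on this page.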

save_to_csv(save_name)

Save outputs of benchmark to csv.

Parameters:

save_name (str) – Filename of output.

classmethod set_dataset(dataset)

Set dataset for Benchmark class.

Parameters:

dataset (Dataset) – Dataset to use with benchmark.

classmethod update_samples(cls, number_of_samples)

Update the sample dataset.

Parameters:

number_of_samples (int) – Number of samples.
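
Since set_dataset and update_samples are classmethods and DATASET and BENCHMARK_SAMPLES are class attributes, the dataset state is held on the class rather than on individual instances. The sketch below illustrates that consequence with a hypothetical stand-in class, not the real implementation. A plain list stands in for Dataset, and taking the first n items in update_samples is an assumption, since the actual selection strategy is not documented here.

```python
class Benchmark:
    """Stand-in mirroring only the documented class attributes and classmethods."""

    DATASET = None
    BENCHMARK_SAMPLES = None

    @classmethod
    def set_dataset(cls, dataset):
        # Stores the dataset on the class, not on an instance.
        cls.DATASET = dataset

    @classmethod
    def update_samples(cls, number_of_samples):
        # Assumption: take the first n items; the real strategy may differ.
        cls.BENCHMARK_SAMPLES = cls.DATASET[:number_of_samples]


first = Benchmark()
second = Benchmark()

# Calls go through the class, yet both existing instances see the result.
Benchmark.set_dataset(["clip-1", "clip-2", "clip-3", "clip-4"])
Benchmark.update_samples(2)

shared = first.DATASET is second.DATASET
samples_seen_by_second = second.BENCHMARK_SAMPLES
```

One practical implication, if the real class behaves the same way, is that changing the dataset affects every benchmark object of that class at once.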