speechtotext.metric.metrics.Metrics
- class Metrics(reference, hypothesis, audio_id, duration, with_cleaning=True)
Bases: object
Class to calculate the metrics.
- mer
Match error rate (MER).
The MER is the proportion of word matches that are errors (substitutions, deletions and insertions); see the formula sketch after this attribute list.
- Type:
- wip
Word information preserved (WIP).
The WIP measures the proportion of word information that is preserved between the reference and the hypothesis.
- Type:
- cer
Character error rate (CER).
The CER is the proportion of characters that were predicted incorrectly.
- Type:
- substitutions
Number of words substituted (substitutions).
The substitutions count is the number of words in the reference that were replaced by a different word in the hypothesis.
- Type:
- insertions
Number of words inserted (insertions).
The insertions count is the number of words that were added in the hypothesis but do not appear in the reference.
- Type:
- hits
Number of words correct (hits).
The hits count is the number of words that were predicted correctly.
- Type:
- deletions
Number of words deleted (deletions).
The deletions count is the number of words in the reference that are missing from the hypothesis.
- Type:
- duration
Duration of the transcribing (duration).
The duration is how long it took to transcribe the audio file.
- Type:
- meteor
Metric for Evaluation of Translation with Explicit ORdering (METEOR).
METEOR is an automatic metric for machine translation evaluation that is based on a generalized concept of unigram matching between the machine-produced translation and human-produced reference translations.
- Type:
- bleu
Bilingual Evaluation Understudy (BLEU).
BLEU is used to compare a candidate translation against one or more reference translations; see the NLTK example after this attribute list.
- Type:
- rouge_1_r
Recall-Oriented Understudy for Gisting Evaluation recall of 1-grams (ROUGE-1-r).
ROUGE includes measures to automatically determine the quality of a summary by comparing it to other (ideal) summaries created by humans. ROUGE-1-r is the recall of 1-grams; see the ROUGE sketch after this attribute list.
- Type:
- rouge_1_p
Recall-Oriented Understudy for Gisting Evaluation precision of 1-grams (ROUGE-1-p).
ROUGE includes measures to automatically determine the quality of a summary by comparing it to other (ideal) summaries created by humans. ROUGE-1-p is the precision of 1-grams.
- Type:
- rouge_1_f
Recall-Oriented Understudy for Gisting Evaluation F1-score of 1-grams (ROUGE-1-f).
ROUGE includes measures to automatically determine the quality of a summary by comparing it to other (ideal) summaries created by humans. ROUGE-1-f is the F1-score of 1-grams.
- Type:
- rouge_2_r
Recall-Oriented Understudy for Gisting Evaluation recall of 2-grams (ROUGE-2-r).
ROUGE includes measures to automatically determine the quality of a summary by comparing it to other (ideal) summaries created by humans. ROUGE-2-r is the recall of 2-grams.
- Type:
- rouge_2_p
Recall-Oriented Understudy for Gisting Evaluation precision of 2-grams (ROUGE-2-p).
ROUGE includes measures to automatically determine the quality of a summary by comparing it to other (ideal) summaries created by humans. ROUGE-2-p is the precision of 2-grams.
- Type:
- rouge_2_f
Recall-Oriented Understudy for Gisting Evaluation F1-score of 2-grams (ROUGE-2-f).
ROUGE includes measures to automatically determine the quality of a summary by comparing it to other (ideal) summaries created by humans. ROUGE-2-f is the F1-score of 2-grams.
- Type:
- rouge_l_r
Recall-Oriented Understudy for Gisting Evaluation recall of LCS (ROUGE-L-r).
ROUGE-L is based on the longest common subsequence (LCS) between our model output and reference. ROUGE-L-r is the recall of LCS.
- Type:
- rouge_l_p
Recall-Oriented Understudy for Gisting Evaluation precision of LCS (ROUGE-L-p).
ROUGE-L is based on the longest common subsequence (LCS) between our model output and reference. ROUGE-L-p is the precision of LCS.
- Type:
- rouge_l_f
Recall-Oriented Understudy for Gisting Evaluation F1-score of LCS (ROUGE-L-f).
ROUGE-L is based on the longest common subsequence (LCS) between our model output and reference. ROUGE-L-f is the F1-score of LCS.
- Type:
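The word-level rates above are derived from the hits, substitutions, deletions and insertions counted on the word alignment between reference and hypothesis. A minimal formula sketch under that assumption (the helper below is illustrative, not part of this class):

    def word_rates(hits, substitutions, deletions, insertions):
        """WER-family rates computed from word-alignment counts."""
        ref_len = hits + substitutions + deletions       # words in the reference
        hyp_len = hits + substitutions + insertions      # words in the hypothesis
        errors = substitutions + deletions + insertions
        wer = errors / ref_len                           # word error rate
        mer = errors / (ref_len + insertions)            # match error rate
        wip = (hits / ref_len) * (hits / hyp_len)        # word information preserved
        return wer, mer, wip

    # The CER follows the same pattern on character-level counts:
    # cer = (char_substitutions + char_deletions + char_insertions) / reference_characters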
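METEOR and BLEU score the hypothesis against the reference as in machine-translation evaluation. The snippet below only illustrates how such scores can be obtained with NLTK; it is not necessarily how this class computes them:

    from nltk.translate.bleu_score import SmoothingFunction, sentence_bleu
    from nltk.translate.meteor_score import meteor_score

    reference = "the quick brown fox jumps over the lazy dog".split()
    hypothesis = "the quick brown fox jumped over a lazy dog".split()

    # BLEU: n-gram precision of the candidate against one or more references.
    bleu = sentence_bleu([reference], hypothesis,
                         smoothing_function=SmoothingFunction().method1)

    # METEOR: unigram matching with stemming/synonyms and a fragmentation penalty.
    # Requires the WordNet corpus (nltk.download('wordnet')).
    meteor = meteor_score([reference], hypothesis)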
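The ROUGE attributes follow the usual recall/precision/F1 definitions: n-gram overlap for ROUGE-1 and ROUGE-2, and the longest common subsequence (LCS) for ROUGE-L. An illustrative sketch of the n-gram case (the function name is ours, not part of this class):

    from collections import Counter

    def rouge_n(reference_tokens, hypothesis_tokens, n=1):
        """Recall, precision and F1 of n-gram overlap (ROUGE-N)."""
        ref_ngrams = Counter(zip(*[reference_tokens[i:] for i in range(n)]))
        hyp_ngrams = Counter(zip(*[hypothesis_tokens[i:] for i in range(n)]))
        overlap = sum((ref_ngrams & hyp_ngrams).values())
        recall = overlap / max(sum(ref_ngrams.values()), 1)     # ROUGE-N-r
        precision = overlap / max(sum(hyp_ngrams.values()), 1)  # ROUGE-N-p
        f1 = 2 * recall * precision / (recall + precision) if recall + precision else 0.0
        return recall, precision, f1

    # ROUGE-L replaces n-gram overlap with the length of the LCS:
    # recall = lcs_length / len(reference_tokens),
    # precision = lcs_length / len(hypothesis_tokens),
    # and the F1-score is again their harmonic mean.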
Class to calculate the metrics.
- Parameters:
reference – the reference (ground-truth) transcript.
hypothesis – the hypothesis transcript produced by the speech-to-text model.
audio_id – identifier of the audio file.
duration – how long the transcription took (see the duration attribute).
with_cleaning – whether to clean the transcripts before the metrics are calculated (defaults to True).
Methods
- Returns all descriptions of the metrics, in the same order as the names returned by get_all_metric_names.
- Returns all possible metric names in a list.
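A minimal usage sketch, assuming the metrics are computed on construction and exposed through the attributes listed above (the strings, identifier and duration value are illustrative only):

    from speechtotext.metric.metrics import Metrics

    metrics = Metrics(
        reference="the quick brown fox",   # ground-truth transcript
        hypothesis="the quick brown box",  # model output
        audio_id="sample_001",             # illustrative identifier
        duration=1.7,                      # time taken to transcribe (assumed to be seconds)
        with_cleaning=True,
    )
    print(metrics.mer, metrics.cer, metrics.bleu, metrics.rouge_1_f)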