Changelog
All notable changes to this project will be documented in this file.
The format is based on Keep a Changelog, and this project adheres to Semantic Versioning.
Unreleased
Changed
Small runtime improvements in the Rust backend.
[1.0.1] - 2025-09-21
Fixed
Fix critical bug in WeightedLevenshtein.from_dict when using insertion costs.
Added
from ocr_stringdist import EditOperation
EditOperation.as_dict()
[1.0.0] - 2025-09-20
Changed
Rename “Learner” to “CostLearner”.
Rework and fix the cost learning algorithm.
Remove with_cost_function from CostLearner.
Remove the functional interface in favour of the WeightedLevenshtein class.
Added
Add calculate_for_unseen parameter to CostLearner.fit().
Add input validation in WeightedLevenshtein.__init__.
Add to_dict and from_dict methods to WeightedLevenshtein.
[0.3.0] - 2025-09-14
Added
Add the option to include the matched characters in the explain method via the filter_matches parameter.
Add the option to learn the costs from a dataset of pairs (OCR result, ground truth) via the WeightedLevenshtein.learn_from method and the Learner class.
Changed
Drop support for PyPy due to issues with PyO3.
[0.2.2] - 2025-09-01
Changed
Improve documentation.
[0.2.1] - 2025-08-31
Fixed
Documentation for PyPI
[0.2.0] - 2025-08-31
Added
WeightedLevenshtein class for reusable configuration.
Explanation of edit operations via WeightedLevenshtein.explain and explain_weighted_levenshtein.
[0.1.0] - 2025-04-26
Added
Custom insertion and deletion costs for weighted Levenshtein distance.
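To illustrate what per-character insertion and deletion costs mean, here is a minimal standalone sketch in plain Python. It is not this project's implementation or API: the function name weighted_levenshtein, its parameters, and the default cost of 1.0 are all assumptions made for the example, and it only handles single-character operations.

```python
from typing import Mapping, Optional


def weighted_levenshtein(
    source: str,
    target: str,
    insertion_costs: Optional[Mapping[str, float]] = None,
    deletion_costs: Optional[Mapping[str, float]] = None,
    default_cost: float = 1.0,
) -> float:
    """Illustrative weighted Levenshtein distance (hypothetical helper).

    Characters missing from a cost mapping fall back to default_cost.
    Substitutions always cost default_cost in this sketch.
    """
    ins = insertion_costs or {}
    dele = deletion_costs or {}

    # First row: build each prefix of target from the empty string
    # using insertions only.
    previous = [0.0]
    for t_ch in target:
        previous.append(previous[-1] + ins.get(t_ch, default_cost))

    for s_ch in source:
        delete_cost = dele.get(s_ch, default_cost)
        # First column: delete every character of the source prefix.
        current = [previous[0] + delete_cost]
        for j, t_ch in enumerate(target, start=1):
            substitute = previous[j - 1] + (0.0 if s_ch == t_ch else default_cost)
            delete = previous[j] + delete_cost
            insert = current[j - 1] + ins.get(t_ch, default_cost)
            current.append(min(substitute, delete, insert))
        previous = current

    return previous[-1]
```

With this sketch, a cheap insertion such as insertion_costs={"x": 0.2} makes turning "abc" into "abxc" cost 0.2 instead of 1.0, while unlisted characters keep the default cost.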
Changed
Breaking changes to the signatures of the Levenshtein distance functions.