v2.1.0
What's New

Tasks
We added a task-based metric system that can evaluate different types of inputs, replacing the old system, which could only evaluate strings (generated text) for language generation tasks. Jury is now able to support a broader set of metrics that work with different input types.
With this, the `jury.Jury` API now controls the consistency of the given set of metrics: Jury will raise an error if any pair of metrics is inconsistent in terms of task (evaluation input).
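As a rough illustration of what this consistency check does, here is a minimal self-contained sketch; the metric-to-task mapping and the function name are assumptions for illustration, not jury's actual internals.

```python
# Hypothetical mapping from metric name to the task it evaluates.
# (In jury, the task determines the expected input type.)
METRIC_TASKS = {
    "bleu": "language-generation",       # evaluates strings (generated text)
    "rouge": "language-generation",      # evaluates strings (generated text)
    "precision": "sequence-classification",  # evaluates integer class labels
}

def check_task_consistency(metric_names):
    """Return the shared task, or raise if the metrics' tasks disagree."""
    tasks = {METRIC_TASKS[name] for name in metric_names}
    if len(tasks) > 1:
        raise ValueError(f"Inconsistent tasks among metrics: {sorted(tasks)}")
    return tasks.pop()

# Consistent set: both metrics evaluate generated text, so this succeeds.
task = check_task_consistency(["bleu", "rouge"])
```

Mixing, say, `bleu` with a classification-task `precision` would raise an error in this sketch, mirroring the behavior described above.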
AutoMetric ✨
- AutoMetric is introduced as the main factory class for automatically loading metrics. As a side note, `load_metric` is still available for backward compatibility and is preferred (it uses AutoMetric under the hood).
- Tasks are now distinguished within metrics. For example, precision can be used for the `language-generation` or the `sequence-classification` task, where one evaluates from strings (generated text) while the other evaluates from integers (class labels).
- In the configuration file, metrics can now be stated with HuggingFace datasets' metric initialization parameters. The keyword arguments used for computation are now separated under the `"compute_kwargs"` key.
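To make the `"compute_kwargs"` separation concrete, here is a sketch of what such a configuration entry might look like and how initialization parameters could be split from compute-time keyword arguments; the exact field names (`metric_name`, `task`) are assumptions, not jury's exact schema.

```python
# Illustrative metric configuration entry: top-level keys are treated as
# initialization parameters (as accepted by HuggingFace datasets metrics),
# while arguments used at compute time live under "compute_kwargs".
config = [
    {
        "metric_name": "precision",              # init-time parameter (assumed name)
        "task": "sequence-classification",       # init-time parameter (assumed name)
        "compute_kwargs": {"average": "macro"},  # passed when computing the metric
    },
]

def split_metric_config(entry):
    """Separate init-time parameters from compute-time kwargs."""
    entry = dict(entry)  # avoid mutating the caller's config
    compute_kwargs = entry.pop("compute_kwargs", {})
    return entry, compute_kwargs

init_params, compute_kwargs = split_metric_config(config[0])
```

The design point is simply that loading a metric and computing with it take different arguments, so keeping them under separate keys avoids ambiguity.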
Full Changelog: 2.0.0...2.1.0