[Feature Branch][DeepSparse Evaluation API] Update lm-eval, perplexity, additional datasets #1580

Merged: 36 commits, Feb 9, 2024
Commits:
6035536
initial implementation
dbogunowicz Jan 29, 2024
53cb9ec
initial commit
dbogunowicz Jan 30, 2024
6599f41
add some more tests for hardening
dbogunowicz Jan 30, 2024
4721c1f
Update src/deepsparse/evaluation/cli.py
dbogunowicz Jan 30, 2024
1247794
Update src/deepsparse/transformers/pipelines/text_generation/pipeline.py
dbogunowicz Jan 30, 2024
9e88f89
Apply suggestions from code review
dbogunowicz Jan 30, 2024
fdb21c6
quality
dbogunowicz Jan 30, 2024
be80132
Merge branch 'main' into feature/damian/ui_improvements
dbogunowicz Jan 30, 2024
5d40b8d
Merge remote-tracking branch 'origin/main' into feature/damian/fix_lm…
dbogunowicz Jan 31, 2024
3e5b7a8
fix the UI, implement loglikelihood function
dbogunowicz Feb 1, 2024
ff0944b
Merge branch 'main' into feature/damian/fix_lm_eval
dbogunowicz Feb 1, 2024
f38f0db
remove unneccessary file
dbogunowicz Feb 1, 2024
dd45493
Merge branch 'feature/damian/fix_lm_eval' of github.com:neuralmagic/d…
dbogunowicz Feb 1, 2024
cd10b92
Merge branch 'main' into feature/damian/ui_improvements
dbogunowicz Feb 1, 2024
b2aad17
initial commit
dbogunowicz Feb 2, 2024
35454a1
tests passing, refactor time!
dbogunowicz Feb 2, 2024
d3b84f8
cleanup
dbogunowicz Feb 2, 2024
e7d8c31
Update test_evaluator.py
dbogunowicz Feb 5, 2024
a148fc5
finished
dbogunowicz Feb 5, 2024
3b5977b
rebase
dbogunowicz Feb 5, 2024
a9e9847
quality
dbogunowicz Feb 5, 2024
787ee45
rebase
dbogunowicz Feb 5, 2024
b5a6d6d
manual testing
dbogunowicz Feb 5, 2024
d0698e7
Merge remote-tracking branch 'origin/main' into feature/damian/genera…
dbogunowicz Feb 5, 2024
e10f0c9
UI improvements
dbogunowicz Feb 5, 2024
48a5900
new UI adaptations
dbogunowicz Feb 6, 2024
44e3e6e
make test more lightweight
dbogunowicz Feb 6, 2024
abb6ab8
fix tests 2
dbogunowicz Feb 6, 2024
79fd7e0
Merge branch 'main' into feature/damian/generate_until
dbogunowicz Feb 7, 2024
e5aad65
good point Michael
dbogunowicz Feb 7, 2024
06302dc
Merge branch 'main' into feature/damian/generate_until
dbogunowicz Feb 8, 2024
d65cac6
Return to the name `lm-evaluation-harness` but add alias `lm-eval-har…
dbogunowicz Feb 8, 2024
e0b4f36
Merge branch 'main' into feature/damian/generate_until
dbogunowicz Feb 9, 2024
b82b49b
[DeepSparse Evaluation API] Perplexity (#1555)
dbogunowicz Feb 9, 2024
d4cdd98
Merge branch 'main' into feature/damian/generate_until
dbogunowicz Feb 9, 2024
7a3ad2f
move the registration of the perplexity eval function where it belongs
dbogunowicz Feb 9, 2024
setup.py: 3 changes (2 additions, 1 deletion)

@@ -149,6 +149,7 @@ def _parse_requirements_file(file_path):
     "datasets<2.16",
     "accelerate<0.26",
     "seqeval",
+    "evaluate",
 ]
 _sentence_transformers_integration_deps = ["optimum-deepsparse"] + _torch_deps

@@ -308,7 +309,7 @@ def _setup_entry_points() -> Dict:
     f"deepsparse.image_classification.eval={ic_eval}",
     "deepsparse.license=deepsparse.license:main",
     "deepsparse.validate_license=deepsparse.license:validate_license_cli",
-    "deepsparse.eval=deepsparse.evaluation.cli:main",
+    "deepsparse.evaluate=deepsparse.evaluation.cli:main",
 ]
 }
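For readers unfamiliar with console-script entry points: the string to the right of `=` is the `module:function` that setuptools wires to the command name on the left, so this rename changes only the shell command (`deepsparse.eval` becomes `deepsparse.evaluate`), not the implementation it calls. A stripped-down sketch of the idea; the project and package names below are illustrative only, not the actual DeepSparse setup.py:

```python
from setuptools import find_packages, setup

setup(
    name="example-project",  # hypothetical package name
    packages=find_packages(),
    # mirrors the dependency added in this PR plus a few from the same list
    install_requires=["datasets<2.16", "accelerate<0.26", "seqeval", "evaluate"],
    entry_points={
        "console_scripts": [
            # typing `example.evaluate` in a shell invokes example.evaluation.cli:main
            "example.evaluate=example.evaluation.cli:main",
        ]
    },
)
```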
src/deepsparse/evaluation/cli.py: 14 changes (6 additions, 8 deletions)

@@ -20,7 +20,7 @@
 Module for evaluating models on the various evaluation integrations

 OPTIONS:
-    --model_path MODEL_PATH
+    MODEL_PATH
         A path to an ONNX model, local directory containing ONNX model
         (including all the auxiliary files) or a SparseZoo stub
     -d DATASET, --dataset DATASET

@@ -72,7 +72,7 @@

 from deepsparse.evaluation.evaluator import evaluate
 from deepsparse.evaluation.results import Result, save_result
-from deepsparse.evaluation.utils import args_to_dict, get_save_path
+from deepsparse.evaluation.utils import get_save_path, parse_kwarg_tuples
 from deepsparse.operators.engine_operator import (
     DEEPSPARSE_ENGINE,
     ORT_ENGINE,

@@ -88,12 +88,10 @@
         ignore_unknown_options=True,
     )
 )
-@click.option(
-    "--model_path",
+@click.argument(
+    "model_path",
     type=click.Path(dir_okay=True, file_okay=True),
     required=True,
-    help="A path to an ONNX model, local directory containing ONNX model"
-    "(including all the auxiliary files) or a SparseZoo stub",
 )
 @click.option(
     "-d",

@@ -178,7 +176,7 @@ def main(
     # join datasets to a list if multiple datasets are passed
     datasets = list(dataset) if not isinstance(dataset, str) else dataset
     # format kwargs to a dict
-    integration_args = args_to_dict(integration_args)
+    integration_args = parse_kwarg_tuples(integration_args)

     _LOGGER.info(
         f"Creating {engine_type} pipeline to evaluate from model path: {model_path}"
     )

@@ -203,7 +201,7 @@ def main(
         **integration_args,
     )

-    _LOGGER.info(f"Evaluation done. Results:\n{result}")
+    _LOGGER.info(f"Evaluation done. Results:\n{result.formatted}")

     save_path = get_save_path(
         save_path=save_path,
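The core of the UI change is visible in this diff: `--model_path` becomes a positional `MODEL_PATH` argument, and unknown trailing options are collected and turned into integration kwargs. A minimal, self-contained sketch of that same click pattern follows; the command and the `parse_extra_kwargs` helper are hypothetical stand-ins, not the actual DeepSparse source:

```python
import click


def parse_extra_kwargs(tokens):
    # Turn trailing tokens such as ("--limit", "2", "--fewshot", "5")
    # into a plain dict: {"limit": "2", "fewshot": "5"}.
    keys = [t.lstrip("-") for t in tokens[0::2]]
    values = list(tokens[1::2])
    return dict(zip(keys, values))


@click.command(context_settings=dict(ignore_unknown_options=True))
@click.argument("model_path", type=click.Path(dir_okay=True, file_okay=True), required=True)
@click.option("-d", "--dataset", multiple=True, help="Dataset(s) to evaluate on")
@click.argument("integration_args", nargs=-1, type=click.UNPROCESSED)
def main(model_path, dataset, integration_args):
    datasets = list(dataset)
    extra = parse_extra_kwargs(integration_args)
    click.echo(f"model={model_path} datasets={datasets} extra={extra}")


if __name__ == "__main__":
    main()
```

Invoked as, for example, `python eval_cli.py model.onnx -d wikitext --limit 2`, the unknown `--limit 2` pair is swallowed by the variadic argument and ends up in the kwargs dict handed to the integration.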
src/deepsparse/evaluation/evaluator.py: 4 changes (3 additions, 1 deletion)

@@ -16,6 +16,9 @@
 from typing import List, Optional, Union

 from deepsparse import Pipeline
+from deepsparse.evaluation.integrations.perplexity import (  # noqa
+    integration_eval as integration_eval_perplexity,
+)
 from deepsparse.evaluation.registry import EvaluationRegistry
 from deepsparse.evaluation.results import Result
 from deepsparse.evaluation.utils import create_pipeline

@@ -65,7 +68,6 @@ def evaluate(
     return eval_integration(
         pipeline=pipeline,
         datasets=datasets,
-        engine_type=engine_type,
         batch_size=batch_size,
         splits=splits,
         metrics=metrics,
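The new perplexity import carries a `# noqa` and an unused alias, which suggests it exists for its side effect: importing the module presumably registers the perplexity `integration_eval` function with `EvaluationRegistry` so that `evaluate()` can resolve it by name (compare the commit "move the registration of the perplexity eval function where it belongs"). A rough, hypothetical sketch of such a decorator-based registry; `SimpleRegistry` and `perplexity_eval` are illustrative names, not the actual DeepSparse implementation:

```python
from typing import Callable, Dict


class SimpleRegistry:
    # Hypothetical stand-in for a registry keyed by integration name.
    _integrations: Dict[str, Callable] = {}

    @classmethod
    def register(cls, *names: str):
        def decorator(fn: Callable) -> Callable:
            for name in names:
                cls._integrations[name.lower()] = fn
            return fn
        return decorator

    @classmethod
    def resolve(cls, name: str) -> Callable:
        try:
            return cls._integrations[name.lower()]
        except KeyError:
            raise KeyError(f"No eval integration registered under '{name}'") from None


@SimpleRegistry.register("perplexity")
def perplexity_eval(pipeline, datasets, batch_size=1, **kwargs):
    # Placeholder body; a real integration would compute perplexity
    # over the requested datasets with the given pipeline.
    return {"integration": "perplexity", "datasets": datasets}


# Dispatch by name, mirroring how an evaluate() entry point might pick an integration:
eval_fn = SimpleRegistry.resolve("perplexity")
result = eval_fn(pipeline=None, datasets=["wikitext"], batch_size=2)
```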
6 changes: 3 additions & 3 deletions src/deepsparse/evaluation/integrations/__init__.py
Original file line number Diff line number Diff line change
Expand Up @@ -15,7 +15,7 @@
# flake8: noqa: F401


def try_import_lm_evaluation_harness(raise_error=False):
def try_import_lm_evaluation_harness(raise_error=True):
try:
import lm_eval

Expand All @@ -24,11 +24,11 @@ def try_import_lm_evaluation_harness(raise_error=False):
if raise_error:
raise ImportError(
"Unable to import lm_eval. "
"To install run 'pip install "
"git+https://github.com/EleutherAI/lm-evaluation-harness@b018a7d51'"
"To install run 'pip install lm-eval==0.4.0'"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

when or how will this error during normal use if raise_error=False by default? once the eval actually begins?

Copy link
Contributor Author

@dbogunowicz dbogunowicz Feb 7, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see. Good point. Yes, I will change the default behavior of this function, and set raise_error to True.

This is the intended behavior when the acual eval is being ran. At runtime, when the user intends to use lm-eval, the module will try to do the hot import of the lm-eval. If it fails to find the dependency, installed, it will raise the error.

However, when testing, I do not want to raise errors, but use the output of this function (boolean) to skip the tests that require lm-eval installed.

)
return False


if try_import_lm_evaluation_harness(raise_error=False):
from .lm_evaluation_harness import *
from .perplexity import *
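The thread above describes a common pattern for optional dependencies: fail loudly at runtime the moment the user actually asks for the integration, but only return a boolean when tests probe for availability. A small generic sketch of that pattern; the function and test names are assumptions, not the DeepSparse source:

```python
import pytest


def try_import_optional(module_name: str, install_hint: str, raise_error: bool = True) -> bool:
    """Return True if `module_name` imports; otherwise raise or return False."""
    try:
        __import__(module_name)
        return True
    except ImportError:
        if raise_error:
            raise ImportError(
                f"Unable to import {module_name}. To install run '{install_hint}'"
            )
        return False


# At runtime, a missing dependency fails loudly when the integration is requested:
#   try_import_optional("lm_eval", "pip install lm-eval==0.4.0")
# In tests, the boolean form is used to skip instead of fail:
@pytest.mark.skipif(
    not try_import_optional("lm_eval", "pip install lm-eval==0.4.0", raise_error=False),
    reason="lm-eval is not installed",
)
def test_lm_eval_integration():
    import lm_eval  # noqa: F401
```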