[Feature Branch] KV Cache Interface #1083

Merged
merged 109 commits into from
Jul 12, 2023

Commits on Jun 5, 2023

  1. initial commit

    dbogunowicz committed Jun 5, 2023
    48ac0ac
  2. cf7f2b9

Commits on Jun 6, 2023

  1. 832630a

Commits on Jun 7, 2023

  1. 9958c83
  2. limit to 150mb

    dbogunowicz committed Jun 7, 2023
    e6d2b03
  3. ready to review

    dbogunowicz committed Jun 7, 2023
    7f9935b

Commits on Jun 8, 2023

  1. initial commit

    dbogunowicz authored and markurtz committed Jun 8, 2023
    b1cf01b
  2. [Codegen][ORT][Static Seq Length] TextGenerationPipeline (#946)

    * initial commit
    
    * coreys simplifications
    
    * finishing the second model static
    
    * ready, time for beautification
    
    * ready for review
    
    * moved the code to examples
    
    * fix eos logic
    
    * add argument num_tokens_to_generate
    dbogunowicz authored and markurtz committed Jun 8, 2023
    0a3f48d
  3. [CodeGen][Documentation] (#956)

    * initial commit
    
    * coreys simplifications
    
    * finishing the second model static
    
    * ready, time for beautification
    
    * ready for review
    
    * moved the code to examples
    
    * fix eos logic
    
    * add argument num_tokens_to_generate
    
    * initial commit
    
    * change order
    
    * Update examples/codegen/README.md
    
    Co-authored-by: corey-nm <109536191+corey-nm@users.noreply.github.com>
    
    ---------
    
    Co-authored-by: corey-nm <109536191+corey-nm@users.noreply.github.com>
    2 people authored and markurtz committed Jun 8, 2023
    add4625
  4. 22d2746
  5. 7f1651d
  6. b85746d
  7. refactor successful

    dbogunowicz authored and markurtz committed Jun 8, 2023
    aadc608
  8. Pipeline fully refactored, time to test engine support. Note: Sliding window not yet implemented!

    dbogunowicz authored and markurtz committed Jun 8, 2023
    58bc2b0
  9. First iteration with Sage

    dbogunowicz authored and markurtz committed Jun 8, 2023
    d538444
  10. Apply suggestions from code review

    dbogunowicz authored and markurtz committed Jun 8, 2023
    e19676b
  11. ORT agrees with the Engine. But they both give not entirely correct result. Hey, this is good news still

    dbogunowicz authored and markurtz committed Jun 8, 2023
    7908b74
  12. dynamic ORT vs static DS

    dbogunowicz authored and markurtz committed Jun 8, 2023
    4bc3472
  13. c07f7ed
  14. fixes to get static pipeline a little further along

    Benjamin authored and markurtz committed Jun 8, 2023
    fb77838
  15. adjust shapes and slicing to enable static autoregressive pass - ISSUE: tokens past the base seq len are repeated

    Benjamin authored and markurtz committed Jun 8, 2023
    2097463
  16. migrate from cache_length to positions input

    Benjamin authored and markurtz committed Jun 8, 2023
    5eb10a9
  17. 9213f29
  18. cleanup the pipeline

    dbogunowicz authored and markurtz committed Jun 8, 2023
    d9af004
  19. further cleanup post merge

    dbogunowicz authored and markurtz committed Jun 8, 2023
    476f25d
  20. fab44e4
  21. d454e2f
  22. 1613e25
  23. Stop saving tmp files, otherwise the engine looks for external files in the wrong place

    dbogunowicz authored and markurtz committed Jun 8, 2023
    b61055c
  24. Left pad support

    Benjamin authored and markurtz committed Jun 8, 2023
    6ee25fc
  25. cleanup

    dbogunowicz authored and markurtz committed Jun 8, 2023
    5d3004b
  26. cleanup2

    dbogunowicz authored and markurtz committed Jun 8, 2023
    ace6fa5
  27. Add in pipeline timing

    markurtz committed Jun 8, 2023
    388586d
  28. add in force tokens logic

    markurtz committed Jun 8, 2023
    afd0139
  29. 30eeda7
  30. 5882b56
  31. 4bbe33d
  32. nest input shape override

    markurtz committed Jun 8, 2023
    afa5746
  33. e2bb78c
  34. 2299009

Commits on Jun 9, 2023

  1. 2935b77

Commits on Jun 11, 2023

  1. b89b156

Commits on Jun 13, 2023

  1. initial commit

    dbogunowicz committed Jun 13, 2023
    dc3d61b
  2. a294265
  3. limit to 150mb

    dbogunowicz committed Jun 13, 2023
    af97f2b
  4. ready to review

    dbogunowicz committed Jun 13, 2023
    c117788
  5. fix the erroneous Makefile

    dbogunowicz committed Jun 13, 2023
    4ad5f49
  6. 9e816bb
  7. perhaps fixed GHA

    dbogunowicz committed Jun 13, 2023
    f97467f
  8. 6be8d87
  9. initial commit

    dbogunowicz committed Jun 13, 2023
    e2f088d
  10. Merge remote-tracking branch 'origin/feature/damian/do_not_save_to_tmp' into feature/damian/codegen_pipeline_clean

    dbogunowicz committed Jun 13, 2023
    9fc6c64
  11. tested with actual model

    dbogunowicz committed Jun 13, 2023
    a610faf
  12. remove val_inp argument

    dbogunowicz committed Jun 13, 2023
    347d1fb
  13. Update README.md

    dbogunowicz authored Jun 13, 2023
    e11027c
  14. a950910
  15. Update README.md

    dbogunowicz authored Jun 13, 2023
    c1d02dc
  16. 711cdfb

Commits on Jun 14, 2023

  1. e602662

Commits on Jun 16, 2023

  1. [BugFix] Update deepsparse dockerfile (#1069)

    * Remove autoinstall triggering commands
    
    * Fix typo
    rahul-tuli authored and dbogunowicz committed Jun 16, 2023
    2085c37
  2. initial implementation

    dbogunowicz committed Jun 16, 2023
    2f7bc95
  3. e18fab7
  4. [Fix] Fix CLI benchmark errors (#1071)

    * initial commit
    
    * ready for review
    
    * Update src/deepsparse/utils/onnx.py
    dbogunowicz committed Jun 16, 2023
    0358d87
  5. 06b5246
  6. 2cab681
  7. 63b116b

Commits on Jun 21, 2023

  1. initial commit

    dbogunowicz committed Jun 21, 2023
    cde08b9

Commits on Jun 22, 2023

  1. 99d125c

Commits on Jun 26, 2023

  1. 67ffe47

Commits on Jun 27, 2023

  1. 9937686

Commits on Jun 28, 2023

  1. [KV Cache Interface] DecoderKVCache (#1084)

    * initial implementation
    
    * initial implementation
    
    * Revert "initial implementation"
    
    This reverts commit 765a5f7.
    
    * Merge DecoderKVCache with KVCacheORT (KVCacheORT will not exist, it is just an abstraction)
    
    * rebase
    
    * add tests
    
    * DecoderKVCache that manipulates cache state and additionally passes info to the engine via KVCache object
    
    * improvements after the sync with Mark
    
    * remove prefill
    
    * fix the computation of total cache capacity
    
    * address PR comments
    dbogunowicz authored Jun 28, 2023
    0d6a423
  2. [WiP] [KV Cache Interface] Text Generation & Decoder Engine Implementation (#1089)
    
    * initial commit
    
    * Update src/deepsparse/license.py
    
    * limit to 150mb
    
    * ready to review
    
    * initial commit
    
    * [Codegen][ORT][Static Seq Length] TextGenerationPipeline (#946)
    
    * initial commit
    
    * coreys simplifications
    
    * finishing the second model static
    
    * ready, time for beautification
    
    * ready for review
    
    * moved the code to examples
    
    * fix eos logic
    
    * add argument num_tokens_to_generate
    
    * [CodeGen][Documentation] (#956)
    
    * initial commit
    
    * coreys simplifications
    
    * finishing the second model static
    
    * ready, time for beautification
    
    * ready for review
    
    * moved the code to examples
    
    * fix eos logic
    
    * add argument num_tokens_to_generate
    
    * initial commit
    
    * change order
    
    * Update examples/codegen/README.md
    
    Co-authored-by: corey-nm <109536191+corey-nm@users.noreply.github.com>
    
    ---------
    
    Co-authored-by: corey-nm <109536191+corey-nm@users.noreply.github.com>
    
    * reimplementation for generative pipelines
    
    * restore text generation from examples
    
    * [CodeGen] ONNX model loading to support >2Gb models / two engines (#991)
    
    * refactor successful
    
    * Pipeline fully refactored, time to test engine support. Note: Sliding window not yet implemented!
    
    * First iteration with Sage
    
    * Apply suggestions from code review
    
    * ORT agrees with the Engine. But they both give not entirely correct result. Hey, this is good news still
    
    * dynamic ORT vs static DS
    
    * pipeline handles OPT multitoken pass
    
    * fixes to get static pipeline a little further along
    
    * adjust shapes and slicing to enable static autoregressive pass - ISSUE: tokens past the base seq len are repeated
    
    * migrate from cache_length to positions input
    
    * got it working for multitoken + single token scenario
    
    * cleanup the pipeline
    
    * further cleanup post merge
    
    * Pipeline working for single-token inference only
    
    * do not load the onnx model with external files twice
    
    * pipeline never redundantly saves the external data + more robust tokenizer
    
    * Stop saving tmp files, otherwise the engine looks for external files in the wrong place
    
    * Left pad support
    
    * cleanup
    
    * cleanup2
    
    * Add in pipeline timing
    
    * add in force tokens logic
    
    * remove input validation for text generation pipelines
    
    * remove multitoken support for now
    
    * remove kv cache engine and other fixes
    
    * nest input shape override
    
    * comment out input shape override
    
    * add non batch override for ORT
    
    * clean up generation pipeline
    
    * initial commit
    
    * Update src/deepsparse/license.py
    
    * limit to 150mb
    
    * ready to review
    
    * fix the erroneous Makefile
    
    * perhaps fixed GHA
    
    * take into consideration that GHA creates four files
    
    * initial commit
    
    * tested with actual model
    
    * remove val_inp argument
    
    * Update README.md
    
    * Apply suggestions from code review
    
    * Update README.md
    
    * initial implementation
    
    * initial implementation
    
    * Revert "initial implementation"
    
    This reverts commit 765a5f7.
    
    * rebase
    
    * add tests
    
    * strip down complexity out of text generation pipeline
    
    * initial implementation
    
    * In a good state for the review on 22.06
    
    * remove files to make review easier
    
    * Revert "remove files to make review easier"
    
    This reverts commit ea82e99.
    
    * Merge DecoderKVCache with KVCacheORT (KVCacheORT will not exist, it is just an abstraction)
    
    * rebase
    
    * add tests
    
    * Delete decoder_kv_cache.py
    
    * Delete test_decoder_kv_cache.py
    
    * DecoderKVCache that manipulates cache state and additionally passes info to the engine via KVCache object
    
    * fix formatting of the transformers/utils/__init__.py
    
    * improvements after the sync with Mark
    
    * All changes applied, time for testing
    
    * Scaffolding to also run multitoken
    
    * add delay_overwriting_inputs
    
    * multitoken is working (although in limited capacity)
    
    * fix no kv cache inference
    
    * Do not create engine if not needed
    
    * remove the prefill option
    
    * fix docstring
    
    * remove prefill
    
    * fix the computation of total cache capacity
    
    * merge
    
    * addressed PR comments
    
    * quality
    
    ---------
    
    Co-authored-by: corey-nm <109536191+corey-nm@users.noreply.github.com>
    Co-authored-by: Mark Kurtz <mark.kurtz@neuralmagic.com>
    Co-authored-by: Benjamin <ben@neuralmagic.com>
    4 people authored Jun 28, 2023
    0809aea

Commits on Jun 29, 2023

  1. 7001a6e
  2. now kv cache decoder holds information about the num of tokens preprocessed. also encountered first bug when running with the engine

    dbogunowicz committed Jun 29, 2023
    c1bf5b7
  3. cleanup the old files

    dbogunowicz committed Jun 29, 2023
    79251e6
  4. 9efbdb6
  5. ready for review

    dbogunowicz committed Jun 29, 2023
    da5e93e
  6. ready for testing

    dbogunowicz committed Jun 29, 2023
    a680dac
  7. 7099994
  8. Delete example

    dbogunowicz authored Jun 29, 2023
    1d4d96d
  9. 08e5421
  10. bfaa072
  11. fbeeb4a

Commits on Jul 3, 2023

  1. f83dcab
  2. e659c33
  3. ready for review

    dbogunowicz committed Jul 3, 2023
    cf74ad7
  4. 853f876
  5. e8da07e
  6. quality

    dbogunowicz committed Jul 3, 2023
    58b12c8

Commits on Jul 5, 2023

  1. eecd232
  2. Perplexity Eval for Text Generation Models (#1073)

    * initial commit
    
    * Update src/deepsparse/license.py
    
    * limit to 150mb
    
    * ready to review
    
    * initial commit
    
    * [Codegen][ORT][Static Seq Length] TextGenerationPipeline (#946)
    
    * initial commit
    
    * coreys simplifications
    
    * finishing the second model static
    
    * ready, time for beautification
    
    * ready for review
    
    * moved the code to examples
    
    * fix eos logic
    
    * add argument num_tokens_to_generate
    
    * [CodeGen][Documentation] (#956)
    
    * initial commit
    
    * coreys simplifications
    
    * finishing the second model static
    
    * ready, time for beautification
    
    * ready for review
    
    * moved the code to examples
    
    * fix eos logic
    
    * add argument num_tokens_to_generate
    
    * initial commit
    
    * change order
    
    * Update examples/codegen/README.md
    
    Co-authored-by: corey-nm <109536191+corey-nm@users.noreply.github.com>
    
    ---------
    
    Co-authored-by: corey-nm <109536191+corey-nm@users.noreply.github.com>
    
    * reimplementation for generative pipelines
    
    * restore text generation from examples
    
    * [CodeGen] ONNX model loading to support >2Gb models / two engines (#991)
    
    * refactor successful
    
    * Pipeline fully refactored, time to test engine support. Note: Sliding window not yet implemented!
    
    * First iteration with Sage
    
    * Apply suggestions from code review
    
    * ORT agrees with the Engine. But they both give not entirely correct result. Hey, this is good news still
    
    * dynamic ORT vs static DS
    
    * pipeline handles OPT multitoken pass
    
    * fixes to get static pipeline a little further along
    
    * adjust shapes and slicing to enable static autoregressive pass - ISSUE: tokens past the base seq len are repeated
    
    * migrate from cache_length to positions input
    
    * got it working for multitoken + single token scenario
    
    * cleanup the pipeline
    
    * further cleanup post merge
    
    * Pipeline working for single-token inference only
    
    * do not load the onnx model with external files twice
    
    * pipeline never redundantly saves the external data + more robust tokenizer
    
    * Stop saving tmp files, otherwise the engine looks for external files in the wrong place
    
    * Left pad support
    
    * cleanup
    
    * cleanup2
    
    * Add in pipeline timing
    
    * add in force tokens logic
    
    * remove input validation for text generation pipelines
    
    * remove multitoken support for now
    
    * remove kv cache engine and other fixes
    
    * nest input shape override
    
    * comment out input shape override
    
    * add non batch override for ORT
    
    * clean up generation pipeline
    
    * initial commit
    
    * Update src/deepsparse/license.py
    
    * limit to 150mb
    
    * ready to review
    
    * fix the erroneous Makefile
    
    * perhaps fixed GHA
    
    * take into consideration that GHA creates four files
    
    * initial commit
    
    * tested with actual model
    
    * remove val_inp argument
    
    * Update README.md
    
    * Apply suggestions from code review
    
    * Update README.md
    
    * [BugFix] Update deepsparse dockerfile (#1069)
    
    * Remove autoinstall triggering commands
    
    * Fix typo
    
    * initial implementation
    
    * working implementation for pipeline input
    
    * [Fix] Fix CLI benchmark errors (#1071)
    
    * initial commit
    
    * ready for review
    
    * Update src/deepsparse/utils/onnx.py
    
    * Clean a typo in the pipeline code
    
    * cleanup the old files
    
    * Update src/deepsparse/transformers/engines/nl_decoder_engine.py
    
    * ready for review
    
    * ready for testing
    
    * assert proper padding on pipeline init
    
    * now also supporting kv cache perplexity. time for cleanup
    
    * ready for review
    
    * correctly print engine info
    
    * work with left padding of the tokenizer
    
    * quality
    
    * fix the multitoken inference
    
    ---------
    
    Co-authored-by: corey-nm <109536191+corey-nm@users.noreply.github.com>
    Co-authored-by: Mark Kurtz <mark.kurtz@neuralmagic.com>
    Co-authored-by: Benjamin <ben@neuralmagic.com>
    Co-authored-by: Rahul Tuli <rahul@neuralmagic.com>
    5 people authored Jul 5, 2023
    10c804a
  3. 7bd23d6

Commits on Jul 7, 2023

  1. 10ba82e
  2. e81c327
  3. b737f77
  4. 042cb79
  5. bf4eac3
  6. fix integration tests

    dbogunowicz committed Jul 7, 2023
    c8a1f93

Commits on Jul 10, 2023

  1. initial implementation

    dbogunowicz committed Jul 10, 2023
    d2d3dc1
  2. 6ce1ca4
  3. 47dc986
  4. fix the integration test

    dbogunowicz committed Jul 10, 2023
    ef77d91
  5. f0d74b0
  6. 186c80c
  7. 09993e7

Commits on Jul 11, 2023

  1. ba8c126
  2. Update src/deepsparse/transformers/engines/nl_decoder_engine.py

    Co-authored-by: Rahul Tuli <rahul@neuralmagic.com>
    dbogunowicz and rahul-tuli authored Jul 11, 2023
    37e8a02
  3. 0d308b9

Commits on Jul 12, 2023

  1. 41e9306