pull[bot] commented Nov 2, 2023

See Commits and Changes for more details.


Created by pull[bot] (v2.0.0-alpha.4)

Can you help keep this open source service alive? 💖 Please sponsor : )

pull[bot] added the ⤵️ pull label Nov 2, 2023
pull[bot] added the merge-conflict (Resolve conflicts manually) label Nov 4, 2023
abetlen force-pushed the main branch 5 times, most recently from 4408d7a to cc0fe43, November 14, 2023 20:30
abetlen and others added 21 commits June 4, 2024 00:49
* Templates sometimes have BOS in them, remove duplicate
* tokenize chat format prompts before completion

  This is to ensure that we don't duplicate any special tokens. Hopefully I amended the existing formats correctly?
* updated comment
* corrected a few
* add some missing internals
* proper bos/eos detection
* just let tokenizer do the job
* typo--
* align test with new response
* changed to a warning
* move to another PR
* Use python warnings module

---------

Co-authored-by: Andrei Betlen <[email protected]>
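The commit above concerns chat templates that already contain a BOS token, so that tokenizing the prompt can yield a doubled BOS. A minimal sketch of detecting that case and reporting it through Python's `warnings` module follows. The helper name `strip_duplicate_bos` and the token id `BOS_ID = 1` are hypothetical, chosen for illustration only; this is not the code from the commit, which may warn without modifying the tokens.

```python
import warnings
from typing import List

BOS_ID = 1  # hypothetical BOS token id, for illustration only


def strip_duplicate_bos(tokens: List[int], bos_id: int = BOS_ID) -> List[int]:
    """Collapse a run of leading BOS tokens into a single BOS and warn once."""
    extra = 0
    # Count how many leading BOS tokens are followed by another BOS.
    while (
        len(tokens) - extra >= 2
        and tokens[extra] == bos_id
        and tokens[extra + 1] == bos_id
    ):
        extra += 1
    if extra:
        warnings.warn(
            f"Detected {extra} duplicate leading BOS token(s); dropping the extras.",
            RuntimeWarning,
        )
    return tokens[extra:]


if __name__ == "__main__":
    # A template that already starts with BOS, tokenized with add_bos enabled,
    # could plausibly produce a sequence like this one.
    print(strip_duplicate_bos([BOS_ID, BOS_ID, 42, 43]))
```

Using a warning rather than an exception keeps generation working while still surfacing the problem to the caller, which matches the "changed to a warning" step listed in the commit message.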
* Fix logprobs when BOS is not present
* Fix logprobs when bos is not available
* passthru rpc_servers params wip
* enable llama rpc by default
* convert string to byte
* add rpc package
* Revert "enable llama rpc by default"

  This reverts commit 832c6dd.
* update readme
* Only set rpc_servers when provided
* Add rpc servers to server options

---------

Co-authored-by: Andrei Betlen <[email protected]>
* Support SPM infill
* typo--
* one less layer of parenthesis necessary
* new required internals
* manually add bos/eos if model requires it
* add bos even when unknown

  This is identical behaviour to llama.cpp

  I guess any model that doesn't use BOS is recent enough to have the add_bos_token metadata.
* don't add bos/eos on non-infill pre-tokenized prompt
* add tokenizer hack to remove leading space in suffix
* I keep forgetting metadata are strings
* check if bos exists
* add example
* add cls/sep instead of bos/eos for WPM vocab
* simplify
* color-code filtered suffix

---------

Co-authored-by: Andrei Betlen <[email protected]>
… from memory (#1513)

* feat: add explicit methods to free model

  This commit introduces a `close` method to both `Llama` and `_LlamaModel`, allowing users to explicitly free the model from RAM/VRAM.

  The previous implementation relied on the destructor of `_LlamaModel` to free the model. However, in Python, the timing of destructor calls is unclear; for instance, the `del` statement does not guarantee immediate invocation of the destructor.

  This commit provides an explicit method to release the model, which works immediately and allows the user to load another model without memory issues.

  Additionally, this commit implements a context manager in the `Llama` class, enabling the automatic closure of the `Llama` object when used with the `with` statement.
* feat: Implement ContextManager in _LlamaModel, _LlamaContext, and _LlamaBatch

  This commit enables automatic resource management by implementing the `ContextManager` protocol in `_LlamaModel`, `_LlamaContext`, and `_LlamaBatch`. This ensures that resources are properly managed and released within a `with` statement, enhancing robustness and safety in resource handling.
* feat: add ExitStack for Llama's internal class closure

  This update implements ExitStack to manage and close internal classes in Llama, enhancing efficient and safe resource management.
* Use contextlib ExitStack and closing
* Explicitly free model when closing resources on server

---------

Co-authored-by: Andrei Betlen <[email protected]>
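The pattern described in this commit (an explicit `close` method, context-manager support, and `contextlib.ExitStack` with `closing` to release internal objects) can be sketched in isolation. The classes `NativeHandle` and `Wrapper` below are hypothetical stand-ins and do not reproduce the library's `Llama` or `_LlamaModel` code; only `ExitStack` and `closing` are real standard-library names.

```python
from contextlib import ExitStack, closing
from typing import List


class NativeHandle:
    """Hypothetical stand-in for an object that owns native memory."""

    def __init__(self, name: str, log: List[str]) -> None:
        self.name = name
        self.log = log
        self.closed = False

    def close(self) -> None:
        # Idempotent: releasing twice must be harmless.
        if not self.closed:
            self.closed = True
            self.log.append(self.name)


class Wrapper:
    """Hypothetical owner that frees its internals deterministically."""

    def __init__(self, log: List[str]) -> None:
        self._stack = ExitStack()
        # closing() turns any object with a close() method into a context
        # manager; ExitStack records it and unwinds in reverse order.
        self.model = self._stack.enter_context(closing(NativeHandle("model", log)))
        self.context = self._stack.enter_context(closing(NativeHandle("context", log)))

    def close(self) -> None:
        """Release everything now instead of waiting for the destructor."""
        self._stack.close()

    def __enter__(self) -> "Wrapper":
        return self

    def __exit__(self, *exc_info) -> None:
        self.close()


log: List[str] = []
with Wrapper(log) as w:
    pass  # use w.model / w.context here

print(log)  # resources are released in reverse order of acquisition
```

Because `ExitStack.close()` unwinds in last-in, first-out order, the dependent object (here `context`) is released before the object it depends on (`model`), and a second call to `close()` is a no-op.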
Bumps [pypa/cibuildwheel](https://github.com/pypa/cibuildwheel) from 2.18.1 to 2.19.0.
- [Release notes](https://github.com/pypa/cibuildwheel/releases)
- [Changelog](https://github.com/pypa/cibuildwheel/blob/main/docs/changelog.md)
- [Commits](pypa/cibuildwheel@v2.18.1...v2.19.0)

---
updated-dependencies:
- dependency-name: pypa/cibuildwheel
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
* Update build-wheels-cuda.yaml
* Update build-wheels-cuda.yaml
* revert
* Bump python from 3.8 to 3.9
* Remove python 3.8
* Remove Python 3.7 and 3.8 deprecated
* Bump python from 3.8 to 3.9
* Add python 3.9
* Add python 3.9, remove macos-11 deprecated, add macos-14
* Bump python 3.8 to 3.9
* Add python 3.13
* Add python 3.13
* python 3.13 remove
* remove python 3.13
* remove python 3.8
* Bump macos-13 to macos-14
* Update build-wheels-metal.yaml
* Update build-wheels-metal.yaml
* Update build-and-release.yaml
* Update build-and-release.yaml
* Update build-and-release.yaml
* Update build-and-release.yaml
* Update build-and-release.yaml
* Update build-and-release.yaml
* Update build-and-release.yaml
* Update build-and-release.yaml
* Update build-wheels-metal.yaml
* Update generate-index-from-release.yaml

  Add avx, avx2 and avx512
* Update test.yaml
* Update test-pypi.yaml
* Update publish.yaml
* Update publish-to-test.yaml
* Update build-wheels-cuda.yaml

  Cuda with AVX2 by default
* Update build-wheels-cuda.yaml
* remove DEPRECATED 32 bits
* Update build-and-release.yaml
* Update build-and-release.yaml
* Update build-and-release.yaml
* Update build-and-release.yaml
* Update build-and-release.yaml
* Update build-and-release.yaml
* Update build-and-release.yaml
* Update build-and-release.yaml
* Update build-and-release.yaml
* Update build-and-release.yaml
* Update build-and-release.yaml
* Update build-and-release.yaml
* Update build-and-release.yaml

  Upgrade matrix os to latest version
* Update build-wheels-metal.yaml
* Update build-wheels-cuda.yaml
* Update test.yaml
* Update test-pypi.yaml
* Update test.yaml

  Add cache: 'pip'
* Update publish-to-test.yaml
* Update build-wheels-metal.yaml

  Add cache: 'pip'
* Update build-wheels-cuda.yaml
* Update build-and-release.yaml
* Update build-and-release.yaml
* Update build-wheels-metal.yaml

  remove x86_64
* Update build-wheels-metal.yaml
* Update build-and-release.yaml
* Update build-wheels-metal.yaml
* Update build-wheels-metal.yaml
* Update build-and-release.yaml
* Update build-and-release.yaml
* Update build-and-release.yaml
* Update build-wheels-metal.yaml
* Update build-and-release.yaml
* Update build-and-release.yaml
* Update build-and-release.yaml
* Update build-and-release.yaml
* Update build-wheels-metal.yaml
* revert
* Remove cpu variants

---------

Co-authored-by: Andrei Betlen <[email protected]>

Labels

⤵️ pull, merge-conflict (Resolve conflicts manually)

20 participants

@abetlen @CISC @a-ghorbani @chraac @jkawamoto @Smartappli @jncraton @oobabooga @yentur @grider-withourai @ericcurtin @ddh0 @mjschock @mashuk999 @yurivict @shamitv @tc-wolf @ExtReMLapin @xu-song @benHeid