Undeclared dependency on sentencepiece #1475

Open
nchammas opened this issue Mar 5, 2025 · 0 comments

nchammas commented Mar 5, 2025

Describe the issue as clearly as possible:

I am trying to run the grammar-structured generation example from the docs, and it fails because of what appears to be an undeclared dependency on sentencepiece.

Steps/code to reproduce the bug:

Create a new environment and install outlines[transformers] at version 0.2.1.

Then create a file with the example code taken straight from the docs:

from outlines import models, generate

arithmetic_grammar = """
    ?start: expression

    ?expression: term (("+" | "-") term)*

    ?term: factor (("*" | "/") factor)*

    ?factor: NUMBER
           | "-" factor
           | "(" expression ")"

    %import common.NUMBER
"""

model = models.transformers("WizardLM/WizardMath-7B-V1.1")
generator = generate.cfg(model, arithmetic_grammar)
sequence = generator(
  "Alice had 4 apples and Bob ate 2. "
  + "Write an expression for Alice's apples:"
)

print(sequence)
# (8-2)

Expected result:

Running this should output some arithmetic expression.

Error message:

$ python example.py
Loading checkpoint shards: 100%|████████████████████████████| 2/2 [00:40<00:00, 20.10s/it]
Traceback (most recent call last):
  File ".../.venv/lib/python3.11/site-packages/transformers/utils/import_utils.py", line 1863, in _get_module
    return importlib.import_module("." + module_name, self.__name__)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File ".../.pyenv/versions/3.11.11/lib/python3.11/importlib/__init__.py", line 126, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "<frozen importlib._bootstrap>", line 1204, in _gcd_import
  File "<frozen importlib._bootstrap>", line 1176, in _find_and_load
  File "<frozen importlib._bootstrap>", line 1147, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 690, in _load_unlocked
  File "<frozen importlib._bootstrap_external>", line 940, in exec_module
  File "<frozen importlib._bootstrap>", line 241, in _call_with_frames_removed
  File ".../.venv/lib/python3.11/site-packages/transformers/models/llama/tokenization_llama.py", line 27, in <module>
    import sentencepiece as spm
ModuleNotFoundError: No module named 'sentencepiece'

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File ".../example.py", line 17, in <module>
    model = models.transformers("WizardLM/WizardMath-7B-V1.1")
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File ".../.venv/lib/python3.11/site-packages/outlines/models/transformers.py", line 435, in transformers
    return Transformers(model, tokenizer)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File ".../.venv/lib/python3.11/site-packages/outlines/models/transformers.py", line 138, in __init__
    self.tokenizer = TransformerTokenizer(tokenizer)
                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File ".../.venv/lib/python3.11/site-packages/outlines/models/transformers.py", line 80, in __init__
    self.is_llama = isinstance(self.tokenizer, get_llama_tokenizer_types())
                                               ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File ".../.venv/lib/python3.11/site-packages/outlines/models/transformers.py", line 27, in get_llama_tokenizer_types
    from transformers.models.llama import LlamaTokenizer
  File "<frozen importlib._bootstrap>", line 1229, in _handle_fromlist
  File ".../.venv/lib/python3.11/site-packages/transformers/utils/import_utils.py", line 1851, in __getattr__
    module = self._get_module(self._class_to_module[name])
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File ".../.venv/lib/python3.11/site-packages/transformers/utils/import_utils.py", line 1865, in _get_module
    raise RuntimeError(
RuntimeError: Failed to import transformers.models.llama.tokenization_llama because of the following error (look up to see its traceback):
No module named 'sentencepiece'
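
The two tracebacks above are linked by exception chaining: the `RuntimeError` was raised from the original `ModuleNotFoundError`. The sketch below shows one way to recover the root cause programmatically by walking `__cause__`. It does not call transformers or outlines. `simulate_wrapped_import_error` is a hypothetical stand-in that only mimics the wrapping pattern visible in the traceback, and its message text is illustrative.

```python
def root_cause(exc):
    """Follow the __cause__ chain back to the first exception that was raised."""
    while exc.__cause__ is not None:
        exc = exc.__cause__
    return exc


def simulate_wrapped_import_error():
    """Hypothetical stand-in: wrap a ModuleNotFoundError in a RuntimeError."""
    try:
        raise ModuleNotFoundError(
            "No module named 'sentencepiece'", name="sentencepiece"
        )
    except ModuleNotFoundError as err:
        # "raise ... from err" sets __cause__, which produces the
        # "direct cause of the following exception" line in a traceback.
        raise RuntimeError("Failed to import a tokenizer module") from err


if __name__ == "__main__":
    try:
        simulate_wrapped_import_error()
    except RuntimeError as wrapped:
        cause = root_cause(wrapped)
        print(type(cause).__name__, "for module:", cause.name)
```

One consequence of this wrapping is that code catching only `ImportError` will not catch the outer `RuntimeError`.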

Outlines/Python version information:

Version information

```
$ python -c "from outlines import _version; print(_version.version)"; python -c "import sys; print('Python', sys.version)"; pip freeze
0.2.1
Python 3.11.11 (main, Feb 27 2025, 15:20:28) [Clang 16.0.0 (clang-1600.0.26.6)]
accelerate==1.4.0
aiohappyeyeballs==2.4.8
aiohttp==3.11.13
aiosignal==1.3.2
airportsdata==20250224
annotated-types==0.7.0
attrs==25.1.0
certifi==2025.1.31
cfgv==3.4.0
charset-normalizer==3.4.1
cloudpickle==3.1.1
datasets==3.3.2
dill==0.3.8
diskcache==5.6.3
distlib==0.3.9
filelock==3.17.0
frozenlist==1.5.0
fsspec==2024.12.0
genson==1.3.0
huggingface-hub==0.29.2
identify==2.6.8
idna==3.10
interegular==0.3.3
iso3166==2.1.1
Jinja2==3.1.5
jsonschema==4.23.0
jsonschema-specifications==2024.10.1
lark==1.2.2
MarkupSafe==3.0.2
mpmath==1.3.0
multidict==6.1.0
multiprocess==0.70.16
nest-asyncio==1.6.0
networkx==3.4.2
nodeenv==1.9.1
numpy==1.26.4
outlines==0.2.1
outlines_core==0.1.26
packaging==24.2
pandas==2.2.3
platformdirs==4.3.6
pre_commit==4.1.0
propcache==0.3.0
psutil==7.0.0
pyarrow==19.0.1
pydantic==2.10.6
pydantic_core==2.27.2
python-dateutil==2.9.0.post0
pytz==2025.1
PyYAML==6.0.2
referencing==0.36.2
regex==2024.11.6
requests==2.32.3
rpds-py==0.23.1
safetensors==0.5.3
six==1.17.0
sympy==1.13.1
tokenizers==0.21.0
torch==2.6.0
tqdm==4.67.1
transformers==4.49.0
typing_extensions==4.12.2
tzdata==2025.1
urllib3==2.3.0
virtualenv==20.29.2
xxhash==3.5.0
yarl==1.18.3
```

Context for the issue:

It's easy enough to manually install sentencepiece and fix this, but if outlines depends on this library then it should declare that dependency properly so pip/poetry/uv/etc. do the right thing automatically.
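
Until the dependency is declared, a script can check for the module up front and fail with a clearer message than the chained traceback. This is a minimal sketch using only the standard library. `find_missing` and `require_modules` are hypothetical helper names, not part of outlines or transformers, and the install hint assumes the pip package name matches the module name (which holds for sentencepiece).

```python
import importlib.util


def find_missing(modules):
    """Return the top-level module names in `modules` that are not installed."""
    return [name for name in modules if importlib.util.find_spec(name) is None]


def require_modules(modules):
    """Raise ImportError naming every missing module, with an install hint."""
    missing = find_missing(modules)
    if missing:
        # Assumption: each module is installable under the same name via pip.
        raise ImportError(
            "Missing required modules: "
            + ", ".join(missing)
            + ". Try: pip install "
            + " ".join(missing)
        )
```

Calling `require_modules(["sentencepiece"])` before `models.transformers(...)` would surface the problem before the checkpoint shards are loaded. It is only a user-side workaround and does not replace declaring the dependency in the package metadata.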

nchammas added the bug label Mar 5, 2025