Issue: seed=-1 in llama_cpp.Llama Does Not Ensure Randomness #1473

Open

thongtr-dev opened this issue Mar 4, 2025 · 1 comment

@thongtr-dev

When using llama_cpp.Llama with seed=-1, the generated output remains identical across multiple runs, despite expectations that -1 should introduce randomness. Even after modifying sampling parameters (temperature, top_k, top_p) and restarting the script, the model continues to produce the same structured content.

Steps to Reproduce:

  1. Load a GGUF model using llama_cpp.Llama with seed=-1.
  2. Use Outlines’ generate.json() with a structured schema.
  3. Run the script multiple times and compare outputs (see the comparison sketch after this list).
  4. Modify sampling settings (e.g., temperature=1.2, top_k=80, top_p=0.7), but observe little to no change in output content.
  5. Even after restarting the script or system, the issue persists.
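
To make the cross-run comparison concrete, one option is to join the streamed chunks, normalize whitespace, and hash the result; identical digests across runs mean the generated content is the same. This is a minimal sketch that consumes the exam_stream iterator from the code further below (in place of the print loop):

import hashlib

# Join the streamed chunks, collapse whitespace, and hash the result.
# Identical digests across runs indicate the same generated content,
# ignoring the minor formatting differences noted under Observed Behavior.
full_text = "".join(exam_stream)
normalized = " ".join(full_text.split())
print(hashlib.sha256(normalized.encode("utf-8")).hexdigest())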

Expected Behavior:
Each run should produce unique exam content when using seed=-1, assuming it enables true randomness.

Observed Behavior:
The generated output remains unchanged across runs, with only minor formatting differences (e.g., whitespace variations).

Possible Workarounds Attempted (Without Success):

  • Explicitly setting seed=random.randint(0, 2**32 - 1) (sketched after this list).
  • Tweaking the input prompt dynamically.
  • Increasing sampling randomness with top_k, top_p, and temperature.
  • Restarting the script/system to clear potential caches.
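
For reference, a minimal sketch of the explicit-seed workaround listed above, assuming a fresh seed is drawn on every run and passed to the Llama constructor instead of -1:

import os
import random
from llama_cpp import Llama

# Draw a fresh 32-bit seed each run and pass it explicitly rather than using seed=-1.
explicit_seed = random.randint(0, 2**32 - 1)
print("Using seed:", explicit_seed)

llm = Llama(
    model_path=os.path.join(os.getcwd(), "src", "models", "Mistral-7B-Instruct-v0.3.Q4_K_M.gguf"),
    n_threads=8,
    n_gpu_layers=0,
    seed=explicit_seed,
)

Even with this in place, the generated content was unchanged across runs.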

Here's the code:

from outlines import models, generate, samplers
from llama_cpp import Llama
import os
import json

from pydantic import BaseModel, Field, field_validator
from typing import List, Literal


class Question(BaseModel):
    question_text: str
    options: List[str] = Field(..., min_length=4, max_length=4)
    correct_option: int = Field(..., ge=0, le=3)

    @field_validator("options")
    def check_options_length(cls, v):
        if len(v) != 4:
            raise ValueError("Each question must have exactly 4 options")
        return v

    @field_validator("correct_option")
    def check_correct_option(cls, v, values):
        options = values.data.get("options", [])
        if v not in range(len(options)):
            raise ValueError("correct_option must be an integer between 0 and 3")
        return v


class Section(BaseModel):
    section: Literal[1, 2, 3, 4, 5, 6]
    section_name: Literal[
        "Cloze Grammar Vocabulary",
        "Cloze Contextual Vocabulary",
        "Best Arrangement of Utterances",
        "Cloze Informational Comprehension",
        "Reading Comprehension",
        "Reading Comprehension Advanced",
    ]
    passage_text: str
    questions: List[Question]


class ExamSchema(BaseModel):
    sections: List[Section] = Field(..., min_length=6, max_length=6)


exam_schema_json = json.dumps(ExamSchema.model_json_schema())

# Load the Llama model with improved sampling
llm = Llama(
    model_path=os.path.join(os.getcwd(), "src", "models", "Mistral-7B-Instruct-v0.3.Q4_K_M.gguf"),
    n_threads=8,
    n_gpu_layers=0,
    seed=-1,
)

model = models.LlamaCpp(llm)

sampler = samplers.multinomial(1, temperature=1.0)

generator = generate.json(model, exam_schema_json, sampler)

exam_stream = generator.stream(
    "You are an English teacher preparing an exam for Vietnamese students. "
    "Ensure the questions cover a variety of topics and difficulty levels. "
    "Each question must be unique and well-structured.\nOutput:",
    max_tokens=None,
    stop_at=["Q:", "\n"],
)

for stream in exam_stream:
    print(stream)

The first output stream (tokens concatenated for readability):

{ "sections": [ { "section": 1, "section_name": "Reading Comprehension", "passage_text": "Nowadays, a big change is taking place in the way we write and consume stories. E

The second output stream (tokens concatenated for readability):

{ "sections": [ { "section": 1, "section_name": "Reading Comprehension", "passage_text": "Nowadays, a big change is taking place in the way we write and consume stories

Both outputs contain the phrase: “Nowadays, a big change is taking place in the way we write and consume stories...”
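
One possible way to narrow this down is to bypass Outlines and sample twice directly from the same llama_cpp.Llama instance; if those raw completions differ between calls while the Outlines generator output does not, the repetition would point at the structured-generation path rather than at seed=-1 itself. A minimal sketch, with a placeholder prompt and parameter values chosen only for illustration:

# Diagnostic: sample twice from llama_cpp directly, without Outlines,
# reusing the llm object constructed above.
prompt = "Write one sentence about how people consume stories today."

first = llm(prompt, max_tokens=64, temperature=1.0, top_p=0.95)
second = llm(prompt, max_tokens=64, temperature=1.0, top_p=0.95)

# With working randomness these two completions would normally differ.
print(first["choices"][0]["text"])
print(second["choices"][0]["text"])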

@thongtr-dev (Author)

The model is Mistral 7B Instruct v0.3 Q4_K_M GGUF from https://huggingface.co/MaziyarPanahi/Mistral-7B-Instruct-v0.3-GGUF.
