[Bug]: Claude 3.7 Not following directions. #7268

Open
1 task done
amirshawn opened this issue Mar 15, 2025 · 6 comments
Labels
bug Something isn't working

Comments

@amirshawn

Is there an existing issue for the same bug?

  • I have checked the existing issues.

Describe the bug and reproduction steps

Ever since the change to Claude 3.7, the coding quality has gone way up and coding mistakes have gone way down. However, I am having an issue with Claude not listening to directions. This isn't an isolated incident; it's happening consistently. I am wondering if adjusting the LLM temperature would help. What is the suggested way of doing that? Has anyone else noticed this lately? I've asked it not to do something multiple times, and it proceeded to do it 5 times, and each time it even acknowledged afterwards that it did what it wasn't supposed to. It's very odd. I'm wondering if the system prompts might not be giving enough emphasis to following directions. When I asked it why it repeated the same mistake over and over, it responded that it is mistakenly following its training over the user instructions. If anyone has any ideas on how to get Claude back under control, it would be much appreciated.

OpenHands Installation

Docker command in README

OpenHands Version

No response

Operating System

None

Logs, Errors, Screenshots, and Additional Context

No response

@amirshawn amirshawn added the bug Something isn't working label Mar 15, 2025
@amirshawn
Author

I'm not sure about this but I wanted to mention that 0.28 seemed to work better than 0.28.1. Not sure if it's a coincidence or maybe they made some changes to Claude but I've noticed a difference.

@amirshawn
Author

I realized that memory condensation got turned on when I switched to using dev mode, which once again made OpenHands unusable in real life. When I was looking at config.template.toml, I saw there are a lot of settings. I'm wondering if maybe I don't have it set up correctly. I don't have any settings selected for it.

@amirshawn
Author

Just to follow up, undoing the condensation helps, but it's still very unruly. It used to take a couple of messages before it would respond and do what I ask. Now it will completely ignore instructions even if I ask multiple times. It must have to do with Claude 3.7.

@enyst
Collaborator

enyst commented Mar 16, 2025

> Has anyone else noticed this lately? I've asked it not to do something multiple times and it proceeded to do it 5 times and each time it even acknowledged afterwards that it did what it wasn't supposed to. It's very odd. I'm wondering if the system prompts might not be giving enough emphasis on following directions.

Yes, I have seen this in multiple places (with Claude 3.7, not necessarily with OpenHands). It really is very jumpy and goes off doing stuff, and that stuff is not necessarily what the user asked for.

Personally, I now try to make the FIRST message as clear as possible and have it contain everything important. It may sound obvious, but TBH I haven't always done that. Other times I would start with a small thing, then add another small thing, etc., and it worked; with 3.7 and OpenHands today, I think the first approach works significantly better.

> I'm not sure about this but I wanted to mention that 0.28 seemed to work better than 0.28.1. Not sure if it's a coincidence or maybe they made some changes to Claude but I've noticed a difference.

We made a system prompt update in 0.28.1, I think 🤔
It was significant, with a lot of changes, and the same goes for the tools:
[agent] system message

The point, in part, was to adapt better to 3.7. Maybe we haven't adapted enough, or maybe too much?

@enyst
Collaborator

enyst commented Mar 16, 2025

To add, I also need to watch it, to see when it goes off on a tangent and stop it. I wasn't concerned about that before 3.7, maybe because I felt it wasn't going to keep going for long.

Now with 3.7, it's like it had way too many coffees 😂

@tholum

tholum commented Mar 17, 2025

I have found the same thing with 3.7 in general. I have had better luck (both in Cursor and OpenHands) adding stuff like "Only implement what is asked, don't go ahead of what is stated, don't assume, follow the specs, don't deviate from the instructions, don't reinvent the wheel, use standards" to my prompts.
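
For anyone who wants to try the two suggestions in this thread together (one clear first message, plus explicit "stay on task" rules), here is a minimal sketch in Python. It is only an illustration, not anything that ships with OpenHands: `GUARDRAILS` and `build_first_message` are hypothetical names, and the rules are just the ones quoted above.

```python
# Hypothetical helper: not part of OpenHands, Cursor, or any Claude SDK.
# It only builds a string that you would paste or send as the first message.

# Rules taken from the prompt wording suggested in this thread.
GUARDRAILS = (
    "Only implement what is asked.",
    "Don't go ahead of what is stated.",
    "Don't assume.",
    "Follow the specs.",
    "Don't deviate from the instructions.",
    "Don't reinvent the wheel; use standards.",
)


def build_first_message(task, guardrails=GUARDRAILS):
    """Return a single prompt: the task, then the constraints as a list."""
    rules = "\n".join(f"- {rule}" for rule in guardrails)
    return f"{task.strip()}\n\nConstraints:\n{rules}"


if __name__ == "__main__":
    print(build_first_message("Add a --verbose flag to the CLI."))
```

Whether this actually reduces the tangents will probably depend on the task and the system prompt in use, so treat it as something to experiment with.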

3 participants