webui: Add a "Continue" Action for Assistant Message #16971
Conversation
allozaur commented Nov 3, 2025 • edited
allozaur left a comment
@ngxson @ggerganov lemme know if you think that this logic works for handling the "Continue" action for assistant messages.
@artyfacialintelagent feel free to test this out and give feedback!
ggerganov commented Nov 3, 2025
Is this supposed to work correctly when pressing Continue after stopping a response while it is generating? I am testing with
allozaur commented Nov 3, 2025
I've tested it for the edited assistant responses so far. I will take a close look at the stopped generation -> continue flow as well.
Iq1pl commented Nov 5, 2025
When using gpt-oss in LM Studio, the model generates a new response instead of continuing the previous text. This is because of the Harmony parser; uninstalling it resolves this and the model continues the generation successfully.
Force-pushed f4c3aeb to b8e4bb4
allozaur commented Nov 12, 2025
@ggerganov please check the demos I've attached to the PR description and also test this feature on your end. Looking forward to your feedback!
Force-pushed b8e4bb4 to e0d03e2
Force-pushed 4741f81 to c7e23c7
ggerganov commented Nov 13, 2025
Hm, I wonder why we do it like this. We already have support on the server to continue the assistant message if it is the last one in the request (#13174). The current approach often does not continue properly, as can be seen in the sample videos. Using the assistant prefill functionality above would make this work correctly in all cases.
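The prefill mechanism referenced above boils down to the shape of the request: the conversation sent to the server simply ends with a partial assistant message, and the server continues that message instead of opening a fresh assistant turn. Here is a minimal sketch of building such a payload, assuming the OpenAI-style `/v1/chat/completions` message schema the server exposes; the function name is hypothetical, not the actual webui code.

```typescript
interface ChatMessage {
  role: "system" | "user" | "assistant";
  content: string;
}

// Build a continuation payload: send the history as-is and append the
// partial assistant text as the final message. Because the last message
// has role "assistant", the server prefills it rather than generating a
// brand-new assistant turn.
function buildContinuationRequest(
  history: ChatMessage[],
  partialAssistantText: string
): { messages: ChatMessage[] } {
  return {
    messages: [
      ...history,
      { role: "assistant", content: partialAssistantText },
    ],
  };
}

const req = buildContinuationRequest(
  [{ role: "user", content: "Write a haiku about autumn." }],
  "Crisp leaves drift slowly,"
);
```

The key point is that the webui does not need any synthetic "please continue" prompt; ending the payload with the partial assistant message is the whole trigger.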
ngxson commented Nov 13, 2025 • edited
Agree with @ggerganov, it's better to use the prefill assistant message from #13174. Just one thing to note, though: I think most templates do not support formatting the reasoning content back to the original, so that's probably the only case where it will break.
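To illustrate the failure mode being described: many chat templates serialize only the visible `content` of past assistant turns and silently discard any separate reasoning field, so the prompt the model sees on "Continue" is missing the thinking it originally produced. The sketch below is an illustrative stand-in, not an actual llama.cpp template; the tag format and function name are invented for the example.

```typescript
interface Msg {
  role: string;
  content: string;
  reasoning_content?: string;
}

// Hypothetical template application: only role and content are
// rendered into the prompt, so past reasoning is silently dropped.
function applyTemplate(messages: Msg[]): string {
  return messages
    .map((m) => `<|${m.role}|>\n${m.content}`) // reasoning_content ignored
    .join("\n");
}
```

A reasoning model continuing from such a prompt sees its answer without the chain of thought that led to it, which is one reason continuation can go off the rails for those models.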
allozaur commented Nov 13, 2025
Thanks guys, I missed that! Will patch it and come back to you.
allozaur commented Nov 13, 2025
I've updated the logic with 859e496 and I have tested with a few models; only 1 (
ggerganov commented Nov 13, 2025
For me, both Qwen3 and Gemma3 are able to complete successfully. For example, here is Gemma3 12B IT: webui-continue-0.mp4
It's strange that it didn't work for you. Regarding gpt-oss, I think that "Continue" has to also send the reasoning in this case. Currently it is discarded, and I think it confuses the model.
allozaur commented Nov 13, 2025 • edited
Should we then address the thinking models differently for now, at least from the WebUI perspective?
I will do some more testing with other instruct models and make sure all is working right.
ngxson commented Nov 13, 2025
It's likely due to the chat template; I suspect some chat templates (especially jinja) add the generation prompt. Can you verify how the chat template looks with
ggerganov commented Nov 13, 2025
If it's not too complicated, I'd say change the logic so that "Continue" includes the reasoning of the last assistant message for all reasoning models.
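The suggested change amounts to attaching the last assistant message's reasoning alongside its visible content when building the continuation, so reasoning models like gpt-oss see their own thinking in the prefill. A minimal sketch, assuming the `reasoning_content` field convention used by reasoning-capable chat APIs; the type and function names here are hypothetical, and whether a given chat template actually re-serializes the field back into the prompt is model-dependent:

```typescript
interface ContinuationMsg {
  role: "assistant";
  content: string;
  reasoning_content?: string;
}

// Attach reasoning only when the model produced any; plain instruct
// models keep sending just `content`, unchanged from before.
function buildContinuationMessage(
  content: string,
  reasoning?: string
): ContinuationMsg {
  const msg: ContinuationMsg = { role: "assistant", content };
  if (reasoning && reasoning.length > 0) {
    msg.reasoning_content = reasoning;
  }
  return msg;
}
```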
ngxson commented Nov 13, 2025
The main issue is that some chat templates actively suppress the reasoning content from assistant messages, so I doubt it will work across all models. Actually, I'm thinking about a more generic approach: we could implement a feature in the backend such that both the "raw" generated text (i.e. with
I would say for now, we can put a warning in the webui to tell the user that this feature is experimental and doesn't work across all models. We can improve it later if it gets more usage.
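In line with this caution, the merged change gates the button ("Enable 'Continue' button based on config & non-reasoning model type" in the commit list below the thread). A minimal sketch of that gating predicate, with hypothetical field names rather than the actual webui config keys:

```typescript
interface ContinueConfig {
  // Server accepts a conversation ending in a partial assistant message.
  prefillSupported: boolean;
  // Reasoning models are excluded, since templates may not round-trip
  // reasoning content and continuation can confuse the model.
  isReasoningModel: boolean;
}

// Show "Continue" only when prefill is available and the model is a
// plain (non-reasoning) instruct model.
function canShowContinue(cfg: ContinueConfig): boolean {
  return cfg.prefillSupported && !cfg.isReasoningModel;
}
```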
ngxson commented Nov 17, 2025
allozaur commented Nov 17, 2025
ngxson commented Nov 17, 2025
I think it should behave like nothing was added, without any error message. IIRC LM Studio has the same behavior.
ngxson commented Nov 17, 2025
Yeah, a message like that can also be a good solution.
Force-pushed d8f952d to 8288ca7
allozaur commented Nov 18, 2025 • edited
ngxson left a comment
Thanks, it works as expected now
Merged 99c53d6 into ggml-org:master
* feat: Add "Continue" action for assistant messages
* feat: Continuation logic & prompt improvements
* chore: update webui build output
* feat: Improve logic for continuing the assistant message
* chore: update webui build output
* chore: Linting
* chore: update webui build output
* fix: Remove synthetic prompt logic, use the prefill feature by sending the conversation payload ending with assistant message
* chore: update webui build output
* feat: Enable "Continue" button based on config & non-reasoning model type
* chore: update webui build output
* chore: Update packages with `npm audit fix`
* fix: Remove redundant error
* chore: update webui build output
* chore: Update `.gitignore`
* fix: Add missing change
* feat: Add auto-resizing for Edit Assistant/User Message textareas
* chore: update webui build output
ServeurpersoCom commented Nov 20, 2025 • edited
Tested during an agentic loop also :) Stop-And-Start-With-MCP.mp4


Closes #16097
Add Continue and Save features for chat messages

What's new
* Continue button for assistant messages
* Save button when editing user messages

Technical notes

Demos
* ggml-org/gpt-oss-20b-GGUF: demo1.mp4
* unsloth/Qwen3-Coder-30B-A3B-Instruct-GGUF: demo2.mp4