explicit cast messages to string for RAG purposes #1080
Open
nivibilla wants to merge 2 commits into abetlen:main from
Conversation
Owner

Hey @nivibilla, can you provide a log of the messages that are being sent and causing this issue? The type hints should be correct there, so any need to cast to str is likely another bug (probably in my code, but I'm curious where it originates).
In my case:

I have the same problem. I started the server with the following command:

```shell
python3 -m llama_cpp.server --model /app/vlm_weights/MiniCPM-V-2_6-gguf/ggml-model-Q2_K.gguf --n_gpu_layers -1
```

Then I used the example client code:

```python
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1/", api_key="llama.cpp")
response = client.chat.completions.create(
    model="gpt-4-vision-preview",
    messages=[
        {
            "role": "user",
            "content": [
                {
                    "type": "image_url",
                    "image_url": {
                        "url": "https://user-images.githubusercontent.com/1991296/230134379-7181e485-c521-4d23-a0d6-f7b3b61ba524.png",
                    },
                },
                {
                    "type": "text",
                    "text": "What does the image say. Format your response as a json object with a single 'text' key.",
                },
            ],
        }
    ],
    response_format={
        "type": "json_object",
        "schema": {"type": "object", "properties": {"text": {"type": "string"}}},
    },
)

import json
print(json.loads(response.choices[0].message.content))
```

In my server terminal I get:

```
INFO:     Started server process [526705]
INFO:     Waiting for application startup.
INFO:     Application startup complete.
INFO:     Uvicorn running on http://localhost:8000 (Press CTRL+C to quit)
Exception: can only concatenate str (not "list") to str
Traceback (most recent call last):
  File "/usr/local/lib/python3.10/dist-packages/llama_cpp/server/errors.py", line 171, in custom_route_handler
    response = await original_route_handler(request)
  File "/usr/local/lib/python3.10/dist-packages/fastapi/routing.py", line 301, in app
    raw_response = await run_endpoint_function(
  File "/usr/local/lib/python3.10/dist-packages/fastapi/routing.py", line 212, in run_endpoint_function
    return await dependant.call(**values)
  File "/usr/local/lib/python3.10/dist-packages/llama_cpp/server/app.py", line 513, in create_chat_completion
    ] = await run_in_threadpool(llama.create_chat_completion, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/starlette/concurrency.py", line 39, in run_in_threadpool
    return await anyio.to_thread.run_sync(func, *args)
  File "/usr/local/lib/python3.10/dist-packages/anyio/to_thread.py", line 56, in run_sync
    return await get_async_backend().run_sync_in_worker_thread(
  File "/usr/local/lib/python3.10/dist-packages/anyio/_backends/_asyncio.py", line 2177, in run_sync_in_worker_thread
    return await future
  File "/usr/local/lib/python3.10/dist-packages/anyio/_backends/_asyncio.py", line 859, in run
    result = context.run(func, *args)
  File "/usr/local/lib/python3.10/dist-packages/llama_cpp/llama.py", line 1898, in create_chat_completion
    return handler(
  File "/usr/local/lib/python3.10/dist-packages/llama_cpp/llama_chat_format.py", line 564, in chat_completion_handler
    result = chat_formatter(
  File "/usr/local/lib/python3.10/dist-packages/llama_cpp/llama_chat_format.py", line 229, in __call__
    prompt = self._environment.render(
  File "/usr/local/lib/python3.10/dist-packages/jinja2/environment.py", line 1304, in render
    self.environment.handle_exception()
  File "/usr/local/lib/python3.10/dist-packages/jinja2/environment.py", line 939, in handle_exception
    raise rewrite_traceback_stack(source=source)
  File "<template>", line 4, in top-level template code
TypeError: can only concatenate str (not "list") to str
```
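The traceback points into Jinja2 template rendering: the model's chat template concatenates each message's `content` with string literals, but OpenAI-style multimodal `content` is a *list* of parts, and Python's `str + list` raises exactly this `TypeError`. A minimal reproduction (the template text here is hypothetical; the real one comes from the model's metadata):

```python
import jinja2

# A tiny chat-template fragment of the kind rendered in llama_chat_format.py
# (hypothetical template text, for illustration only).
template = jinja2.Environment().from_string(
    "{% for m in messages %}{{ m['role'] + ': ' + m['content'] }}\n{% endfor %}"
)

# A plain string content renders fine:
print(template.render(messages=[{"role": "user", "content": "hello"}]))

# But OpenAI-style multimodal content is a list of parts, so str + list fails:
try:
    template.render(
        messages=[{"role": "user", "content": [{"type": "text", "text": "hello"}]}]
    )
except TypeError as e:
    print(e)  # can only concatenate str (not "list") to str
```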
Setting the chat format explicitly when constructing the model:

```python
llm = Llama(
    # ...
    chat_format="chatml",
    # ...
)
```
Hi,

I've been testing the OpenAI-compatible server with Ollama-webui, and when using the RAG pipeline the message content comes through as a list rather than a string, so I get this error from the llama-cpp-python server.

Simply force-casting the messages to strings before applying the chat format fixes the issue. If this is not the right way, please let me know what I can do to solve it.

Thanks
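The kind of cast this PR describes can be sketched as follows (hypothetical helper names, not the actual patch): normalize each message's `content` to a plain string before the chat formatter concatenates it into the prompt.

```python
def stringify_content(content):
    """Force message content to str so chat templates can concatenate it."""
    if isinstance(content, str):
        return content
    if isinstance(content, list):
        # Keep the text parts of OpenAI-style content lists; other part types
        # (e.g. image_url) have no obvious string form and are skipped here.
        return "".join(
            part.get("text", "") for part in content if isinstance(part, dict)
        )
    return str(content)


def cast_messages(messages):
    """Return a copy of the messages with every content field cast to str."""
    return [{**m, "content": stringify_content(m.get("content"))} for m in messages]


messages = [
    {"role": "user", "content": [{"type": "text", "text": "What does the image say?"}]}
]
print(cast_messages(messages))
# [{'role': 'user', 'content': 'What does the image say?'}]
```

Note that dropping non-text parts loses the image data, which is why fixing the upstream message handling (rather than a blanket cast) may be the better long-term answer.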