Description
Confirm this is an issue with the Python library and not an underlying OpenAI API
- [x] This is an issue with the Python library
Describe the bug
I encountered a problem where `usage.prompt_tokens_details` is `None`.

This is the output of `response.usage`:

```
CompletionUsage(completion_tokens=57, prompt_tokens=2181, total_tokens=2518, completion_tokens_details=None, prompt_tokens_details=None, reasoning_tokens=280, traffic_type='ON_DEMAND', promptTokensDetails=[{'modality': 'TEXT', 'tokenCount': 2181}], candidatesTokensDetails=[{'modality': 'TEXT', 'tokenCount': 57}])
```
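Note that the typed field is `None` even though the provider's own token-detail keys (`promptTokensDetails`, `candidatesTokensDetails`) survive on the model as extra fields. A minimal sketch of a defensive read, assuming a `usage` object like the one above (`promptTokensDetails` is the provider-specific key, not part of the OpenAI schema):

```python
from openai.types import CompletionUsage

def read_cached_tokens(usage: CompletionUsage) -> int | None:
    """Return cached_tokens when the typed field is populated, else None."""
    details = usage.prompt_tokens_details
    if details is not None:
        return details.cached_tokens
    # Diagnostic fallback: provider-specific extras are retained on the
    # pydantic model, e.g. the Gemini-style promptTokensDetails above.
    print(getattr(usage, "promptTokensDetails", None))
    return None

# read_cached_tokens(response.usage) -> prints the vendor extras, returns None here
```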
Docs for prompt caching: https://platform.openai.com/docs/guides/prompt-caching
Requirements
Caching is available for prompts containing 1024 tokens or more, with cache hits occurring in increments of 128 tokens. Therefore, the number of cached tokens in a request will always fall within the following sequence: 1024, 1152, 1280, 1408, and so on, depending on the prompt's length.
All requests, including those with fewer than 1024 tokens, will display a cached_tokens field in usage.prompt_tokens_details of the Response or Chat Completions object, indicating how many of the prompt tokens were a cache hit. For requests under 1024 tokens, cached_tokens will be zero.
"usage":{"prompt_tokens": 2006, "completion_tokens": 300, "total_tokens": 2306, "prompt_tokens_details":{"cached_tokens": 1920 }, "completion_tokens_details":{"reasoning_tokens": 0, "accepted_prediction_tokens": 0, "rejected_prediction_tokens": 0 } } To Reproduce
To Reproduce

My openai version is 1.99.6:

```python
completion = client.chat.completions.create(
    model="gemini-2.5-pro",
    messages=messages,
    tools=available_tools,
    tool_choice="auto",
    max_tokens=20000,
    extra_headers={"X-TT-LOGID": ""},
)
```
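For completeness, a self-contained sketch of the setup. The `base_url` and `api_key` are placeholders (the `gemini-2.5-pro` model name and the `promptTokensDetails`/`candidatesTokensDetails` usage fields suggest an OpenAI-compatible Gemini gateway), and `messages`/`available_tools` are reduced to a minimal shape:

```python
from openai import OpenAI

# Placeholder endpoint and key; substitute the actual OpenAI-compatible
# gateway used in the original report.
client = OpenAI(base_url="https://example-gateway/v1", api_key="sk-...")

messages = [
    {"role": "user", "content": "a long prompt of 1024+ tokens ..."},
]
available_tools = [
    {
        "type": "function",
        "function": {
            "name": "noop",
            "description": "placeholder tool",
            "parameters": {"type": "object", "properties": {}},
        },
    }
]

completion = client.chat.completions.create(
    model="gemini-2.5-pro",
    messages=messages,
    tools=available_tools,
    tool_choice="auto",
    max_tokens=20000,
    extra_headers={"X-TT-LOGID": ""},
)

print(completion.usage.prompt_tokens_details)  # -> None (the bug)
```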
Code snippets
OS
Linux
Python version
Python 3.11.2
Library version
openai v1.99.6