Using the ChatGPT streaming API from Python
I wanted to stream the results from the ChatGPT API as they were generated, rather than waiting for the entire thing to complete before displaying anything.
Here's how to do that with the openai Python library:
import openai

openai.api_key = open("/Users/simon/.openai-api-key.txt").read().strip()

for chunk in openai.ChatCompletion.create(
    model="gpt-3.5-turbo",
    messages=[{
        "role": "user",
        "content": "Generate a list of 20 great names for sentient cheesecakes that teach SQL"
    }],
    stream=True,
):
    content = chunk["choices"][0].get("delta", {}).get("content")
    if content is not None:
        print(content, end='')
Running this code in a Jupyter notebook prints the response into the cell output a token at a time, as it streams in.
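Note that this uses the pre-1.0 openai library, which exposed openai.ChatCompletion. If you have the 1.x client instead, a minimal sketch of the same idea (assuming openai>=1.0 is installed and OPENAI_API_KEY is set in the environment) looks like this:

# Minimal sketch for the openai>=1.0 client (assumption: 1.x installed)
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

stream = client.chat.completions.create(
    model="gpt-3.5-turbo",
    messages=[{
        "role": "user",
        "content": "Generate a list of 20 great names for sentient cheesecakes that teach SQL"
    }],
    stream=True,
)
for chunk in stream:
    # In 1.x the chunks are typed objects rather than dicts
    content = chunk.choices[0].delta.content
    if content is not None:
        print(content, end='')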
Using async/await
The OpenAI Python library can also work with asyncio. Here's how to do the above using their async/await support, via the .acreate() method:
async for chunk in await openai.ChatCompletion.acreate(
    model="gpt-3.5-turbo",
    messages=[{
        "role": "user",
        "content": "Generate a list of 20 great names for sentient cheesecakes that teach SQL"
    }],
    stream=True,
):
    content = chunk["choices"][0].get("delta", {}).get("content")
    if content is not None:
        print(content, end='')
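A top-level async for like this works in a Jupyter notebook, which supports top-level await. In a regular Python script you need to wrap it in a coroutine and run it with asyncio.run(); a minimal sketch:

import asyncio

import openai

openai.api_key = open("/Users/simon/.openai-api-key.txt").read().strip()

async def main():
    async for chunk in await openai.ChatCompletion.acreate(
        model="gpt-3.5-turbo",
        messages=[{
            "role": "user",
            "content": "Generate a list of 20 great names for sentient cheesecakes that teach SQL"
        }],
        stream=True,
    ):
        content = chunk["choices"][0].get("delta", {}).get("content")
        if content is not None:
            print(content, end='')

asyncio.run(main())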
Those chunks
Here's what those chunks look like - the first two and then the last two:
{
  "choices": [
    {
      "delta": {
        "role": "assistant"
      },
      "finish_reason": null,
      "index": 0
    }
  ],
  "created": 1680380941,
  "id": "chatcmpl-70c8LVUSYoSbdQTyONgJfcVU542wO",
  "model": "gpt-3.5-turbo-0301",
  "object": "chat.completion.chunk"
}
{
  "choices": [
    {
      "delta": {
        "content": "1"
      },
      "finish_reason": null,
      "index": 0
    }
  ],
  "created": 1680380941,
  "id": "chatcmpl-70c8LVUSYoSbdQTyONgJfcVU542wO",
  "model": "gpt-3.5-turbo-0301",
  "object": "chat.completion.chunk"
}
# ... lots more here ...
{
  "choices": [
    {
      "delta": {
        "content": "ina"
      },
      "finish_reason": null,
      "index": 0
    }
  ],
  "created": 1680380941,
  "id": "chatcmpl-70c8LVUSYoSbdQTyONgJfcVU542wO",
  "model": "gpt-3.5-turbo-0301",
  "object": "chat.completion.chunk"
}
{
  "choices": [
    {
      "delta": {},
      "finish_reason": "stop",
      "index": 0
    }
  ],
  "created": 1680380941,
  "id": "chatcmpl-70c8LVUSYoSbdQTyONgJfcVU542wO",
  "model": "gpt-3.5-turbo-0301",
  "object": "chat.completion.chunk"
}
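The first chunk sets the role, each subsequent chunk carries a fragment of content, and the final chunk has an empty delta with finish_reason set to "stop". To reassemble the full message you can merge the deltas as they arrive; a minimal sketch, assuming the dict-style chunks shown above:

def accumulate(chunks):
    # Merge streamed delta fragments back into one complete message
    message = {"role": None, "content": ""}
    for chunk in chunks:
        delta = chunk["choices"][0].get("delta", {})
        if "role" in delta:
            message["role"] = delta["role"]
        if "content" in delta:
            message["content"] += delta["content"]
    return message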
Related
- gpt3 Using OpenAI functions and their Python library for data extraction - 2023-07-09
- llms How streaming LLM APIs work - 2024-09-21
- gpt3 A simple Python wrapper for the ChatGPT API - 2023-03-02
- json Processing a stream of chunks of JSON with ijson - 2023-08-15
- httpx Logging OpenAI API requests and responses using HTTPX - 2024-01-26
- llms Expanding ChatGPT Code Interpreter with Python packages, Deno and Lua - 2023-04-30
- llms Training nanoGPT entirely on content from my blog - 2023-02-09
- llms Running OpenAI's large context models using llm - 2023-06-13
- llms Using llama-cpp-python grammars to generate JSON - 2023-09-12
- python Calculating embeddings with gtr-t5-large in Python - 2023-01-31
Created 2023-04-01T13:31:53-07:00, updated 2023-04-01T13:44:07-07:00