Streaming doesn't actually make the ChatGPT API return its full result any faster; it makes users feel that things are moving much faster, so the user experience is a lot better.
The key is to enable streaming mode when calling the ChatGPT API.
Here is the example code:
# Example of an OpenAI ChatCompletion request with stream=True
# https://platform.openai.com/docs/guides/chat

import openai

# a ChatCompletion request
response = openai.ChatCompletion.create(
    model='gpt-3.5-turbo',
    messages=[
        {'role': 'user', 'content': "What's 1+1? Answer in one word."}
    ],
    temperature=0,
    stream=True  # this time, we set stream=True
)

for chunk in response:
    print(chunk)
As you can see, the key is simply to add one more parameter, stream=True. Instead of waiting for the full response, the loop then prints each chunk as soon as it arrives.
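In practice you usually want to assemble the streamed chunks back into the full reply rather than print them raw. Each chunk's choices[0]["delta"] may carry a "content" piece. Here is a minimal sketch of that assembly step; the mock chunk dicts are an assumption standing in for the real API stream, but they follow the same shape:

```python
# Sketch: assembling streamed chunks into the full reply.
# Real chunks from openai.ChatCompletion.create(..., stream=True) have the
# same structure: choices[0]["delta"] optionally contains a "content" piece.

def collect_stream(chunks):
    """Concatenate the content pieces from a stream of chunk dicts."""
    parts = []
    for chunk in chunks:
        delta = chunk["choices"][0]["delta"]
        if "content" in delta:
            parts.append(delta["content"])  # incremental text piece
    return "".join(parts)

# Mock chunks for illustration (an assumption, not real API output):
mock_chunks = [
    {"choices": [{"delta": {"role": "assistant"}}]},  # first chunk: role only
    {"choices": [{"delta": {"content": "Two"}}]},
    {"choices": [{"delta": {"content": "."}}]},
    {"choices": [{"delta": {}}]},  # final chunk carries no content
]

print(collect_stream(mock_chunks))  # → Two.
```

Printing each piece as it arrives (instead of collecting them) is what produces the typewriter effect users perceive as faster.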