gpt4free/vision.md at 9918df98b371684bd7d2ff906405432938b6efc4

mirror of https://github.com/xtekky/gpt4free.git synced 2024-12-23 11:02:40 +03:00

2024-12-13 22:20:58 +01:00


Vision Support in Chat Completion

This document shows how to send images along with chat completion requests, first through the HTTP API and then through the Python client. Each approach comes with an example.

Example with the API

To use vision support in chat completion with the API, follow the example below:

import requests
from g4f.image import to_data_uri
from g4f.requests.raise_for_status import raise_for_status

# Endpoint of the chat completion API
url = "http://localhost:8080/v1/chat/completions"
body = {
    "model": "",
    "provider": "Copilot",
    "messages": [
        {"role": "user", "content": "What is in this image?"}
    ],
    # Each entry is a [data URI, filename] pair
    "images": [
        ["data:image/jpeg;base64,...", "cat.jpeg"]
    ]
}
# Send the request with the API key header
response = requests.post(url, json=body, headers={"g4f-api-key": "secret"})
# Raise an error if the request failed, then print the parsed JSON response
raise_for_status(response)
print(response.json())

In this example:

* url is the endpoint of the chat completion API.
* body contains the model, provider, messages, and images.
* messages is a list of message objects, each with a role and content.
* images is a list of pairs, each holding an image in Data URI format and an optional filename.
* response stores the API response.
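
The images entries described above pair a data URI with a filename. The following is a minimal sketch of how such a body could be assembled from raw image bytes using only the Python standard library. The helper names bytes_to_data_uri and build_body are hypothetical and are not part of g4f; the library's own to_data_uri may behave differently.

```python
import base64
import mimetypes


def bytes_to_data_uri(data: bytes, filename: str) -> str:
    """Encode raw image bytes as a base64 data URI (hypothetical helper)."""
    # Guess the MIME type from the file extension; fall back to a generic type.
    mime, _ = mimetypes.guess_type(filename)
    if mime is None:
        mime = "application/octet-stream"
    encoded = base64.b64encode(data).decode("ascii")
    return f"data:{mime};base64,{encoded}"


def build_body(images, prompt, provider="Copilot"):
    """Assemble a request body shaped like the example above.

    `images` is a list of (bytes, filename) tuples (an assumption of this sketch).
    """
    return {
        "model": "",
        "provider": provider,
        "messages": [{"role": "user", "content": prompt}],
        # Each entry is a [data URI, filename] pair.
        "images": [[bytes_to_data_uri(data, name), name] for data, name in images],
    }
```

The resulting dictionary can be passed as the json argument of requests.post, as in the example above.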

Example with the Client

To use vision support in chat completion with the client, follow the example below:

import g4f
import g4f.Provider

def chat_completion(prompt):
    client = g4f.Client(provider=g4f.Provider.Blackbox)
    # Each entry is a [file object, filename] pair
    images = [
        [open("docs/images/waterfall.jpeg", "rb"), "waterfall.jpeg"],
        [open("docs/images/cat.webp", "rb"), "cat.webp"]
    ]
    # An empty model name is passed, as in the API example
    response = client.chat.completions.create(
        messages=[{"content": prompt, "role": "user"}],
        model="",
        images=images
    )
    print(response.choices[0].message.content)

prompt = "What is in these images?"
chat_completion(prompt)

Example response:

**Image 1**

* A waterfall with a rainbow
* Lush greenery surrounding the waterfall
* A stream flowing from the waterfall

**Image 2**

* A white cat with blue eyes
* A bird perched on a window sill
* Sunlight streaming through the window
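
The client example opens its image files without closing them. One possible way to tidy this up is sketched below with contextlib.ExitStack from the standard library. The helper open_images is hypothetical and not part of g4f, and the sketch assumes the client finishes reading the files before the with block exits.

```python
from contextlib import ExitStack
from pathlib import Path


def open_images(stack, paths):
    """Open each path in binary mode and return [file, filename] pairs.

    Every file is registered with `stack`, so all handles are closed
    when the enclosing `with ExitStack()` block exits.
    """
    images = []
    for path in map(Path, paths):
        handle = stack.enter_context(open(path, "rb"))
        images.append([handle, path.name])
    return images


# Usage sketch (requires g4f, so it is left as a comment here):
# with ExitStack() as stack:
#     images = open_images(stack, ["docs/images/waterfall.jpeg", "docs/images/cat.webp"])
#     response = client.chat.completions.create(messages=messages, model="", images=images)
```

The pairs have the same shape as the images list in the client example, so they could be passed through the images argument unchanged.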