docs: transform vectordb page into subdirectory
[DOCS-1197]: https://hasurahq.atlassian.net/browse/DOCS-1197?atlOrigin=eyJpIjoiNWRkNTljNzYxNjVmNDY3MDlhMDU5Y2ZhYzA5YTRkZjUiLCJwIjoiZ2l0aHViLWNvbS1KU1cifQ
PR-URL: https://github.com/hasura/graphql-engine-mono/pull/10078
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
GitOrigin-RevId: e1e218318c115c469b7e90cd4b2fbb4b2bbbd767
4
docs/docs/databases/vector-databases/_category_.json
Normal file
@ -0,0 +1,4 @@
|
||||
{
|
||||
"label": "Vector Databases",
|
||||
"position": 1.45
|
||||
}
|
60
docs/docs/databases/vector-databases/index.mdx
Normal file
@ -0,0 +1,60 @@
|
||||
---
|
||||
sidebar_label: Vector Databases
|
||||
keywords:
|
||||
- hasura
|
||||
- docs
|
||||
- databases
|
||||
- vector databases
|
||||
- ai
|
||||
- machine learning
|
||||
sidebar_position: 1
|
||||
---
|
||||
|
||||
# How Does Hasura Work with Vector Databases?
|
||||
|
||||
## What are vectors?
|
||||
|
||||
Vectors are mathematical representations of unstructured data such as text, audio, or video. Vectors generated by
deep neural network models like large language models (LLMs) are high-dimensional so that they can capture multiple
latent features, which can then be used to classify or cluster related text.
|
||||
|
||||
Word vectors are numerical representations of individual words that capture their meaning and usage patterns. Each word
|
||||
is represented as a vector in a high-dimensional space, where the dimensions correspond to different features or
|
||||
attributes of the word, such as its context, syntactic role, and semantic properties.
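
To make "closeness" in a vector space concrete, here is a minimal sketch of cosine similarity, one common way to compare two vectors. The three-dimensional vectors are hand-made, hypothetical values chosen for illustration; real word vectors come from a model and usually have hundreds of dimensions.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Hypothetical toy "word vectors" (not output from any real model).
king = [0.9, 0.8, 0.1]
queen = [0.85, 0.82, 0.15]
banana = [0.1, 0.05, 0.9]

# Related words should score higher than unrelated ones.
print(cosine_similarity(king, queen) > cosine_similarity(king, banana))  # prints True
```

A score near 1 means the vectors point in almost the same direction; a score near 0 means they have little in common.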
|
||||
|
||||
## Vectors in the context of Large Language Models
|
||||
|
||||
LLMs are trained on public datasets, and only up to a certain point in time. On their own, they can't answer questions
about your new organizational data.
|
||||
|
||||
For instance, at the time of writing, the training data for OpenAI's `text-davinci-003` model ends in June 2021, so it
has no knowledge of events in 2023.
|
||||
|
||||
However, we can steer an LLM to answer queries relevant to our new data by providing the context in which it should
answer the question. This is typically done by providing additional information that it can use. But because the input
size is limited, we have to pick the right data to feed it as context.
|
||||
|
||||
We do this by taking all the textual data and chunking it, which is an important strategy for your LLM application.
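
As a rough illustration of chunking, the sketch below splits text into fixed-size word windows that overlap slightly, so a sentence cut at a boundary still appears whole in one of its neighbors. The function name, the word-based sizing, and the default values are assumptions made for this example; real applications often chunk by tokens, sentences, or document structure instead.

```python
def chunk_text(text, chunk_size=50, overlap=10):
    """Split text into chunks of up to `chunk_size` words.

    Consecutive chunks share `overlap` words. Hypothetical helper for
    illustration only; it is not part of Hasura or Weaviate.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be >= 0 and smaller than chunk_size")

    words = text.split()
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # the last window already reached the end of the text
    return chunks


sample = " ".join(f"w{i}" for i in range(10))
print(chunk_text(sample, chunk_size=4, overlap=1))
# prints ['w0 w1 w2 w3', 'w3 w4 w5 w6', 'w6 w7 w8 w9']
```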
|
||||
|
||||
Now, when a user query comes in, we search for chunks with similar context and feed them to our LLM.
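
The sketch below shows one way the retrieved chunks might be packed into a prompt without exceeding the input limit. It assumes the chunks are already ranked most-relevant first, and it measures the budget in characters as a stand-in for tokens, which is what real models count. The function name and prompt wording are hypothetical.

```python
def build_prompt(question, ranked_chunks, max_context_chars=1000):
    """Pack the highest-ranked chunks into a prompt within a size budget.

    `ranked_chunks` is assumed to be sorted most-relevant first. The budget
    counts chunk characters only, not the separators or instructions.
    """
    selected = []
    used = 0
    for chunk in ranked_chunks:
        if used + len(chunk) > max_context_chars:
            break  # adding this chunk would exceed the budget
        selected.append(chunk)
        used += len(chunk)

    context = "\n---\n".join(selected)
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\n"
    )


chunks = ["a" * 400, "b" * 400, "c" * 400]
prompt = build_prompt("Who applied for the role?", chunks, max_context_chars=1000)
print("c" * 400 in prompt)  # prints False: the third chunk did not fit
```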
|
||||
|
||||
## Why do we need vector databases?
|
||||
|
||||
Vector databases are optimized to search for similar vectors.
|
||||
|
||||
Chunking and then searching for relevant chunks can't be done at query time for large systems with a large amount of
|
||||
text.
|
||||
|
||||
We first chunk the text, convert each chunk into a vector, and store those vectors in a vector database ahead of time,
so that at query time we can find relevant chunks using semantic search.
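
The index-ahead-of-time, search-at-query-time pattern can be sketched with a tiny in-memory store. Everything here is an assumption made for illustration: `embed` is a toy word-count stand-in for a real embedding model, so it matches on shared words rather than meaning, and a real vector database would use specialized indexes instead of scoring every row.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy stand-in for an embedding model: a sparse word-count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(count * b[word] for word, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class InMemoryVectorStore:
    """Hypothetical minimal store: vectors are computed once, at insert time."""

    def __init__(self):
        self._rows = []  # list of (chunk, vector) pairs

    def add(self, chunk):
        self._rows.append((chunk, embed(chunk)))

    def search(self, query, k=2):
        """Return the k stored chunks most similar to the query."""
        query_vector = embed(query)
        ranked = sorted(
            self._rows,
            key=lambda row: cosine(query_vector, row[1]),
            reverse=True,
        )
        return [chunk for chunk, _ in ranked[:k]]


store = InMemoryVectorStore()
store.add("Senior Python engineer with five years of backend experience")
store.add("Pastry chef specialising in French desserts")
store.add("Data engineer experienced in Python and SQL pipelines")

print(store.search("python backend engineer", k=1))
# prints ['Senior Python engineer with five years of backend experience']
```

Only the query is embedded at search time; the stored chunks were vectorized when they were added, which is what keeps lookups fast as the corpus grows.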
|
||||
|
||||
## How does Hasura work with vector databases?
|
||||
|
||||
Hasura connects to vector databases in much the same way as it does to relational databases. You can quickly and easily
deploy a custom data connector agent to connect to your vector database.
|
||||
|
||||
## Next steps
|
||||
|
||||
- [Connect to a Weaviate vector database](/databases/vector-databases/weaviate.mdx)
|
||||
- [Learn more about Hasura Data Connectors](/databases/data-connectors/index.mdx)
|
||||
- [Deploy a custom data connector agent to Hasura Cloud](/hasura-cli/connector-plugin/index.mdx)
|
178
docs/docs/databases/vector-databases/weaviate.mdx
Normal file
@ -0,0 +1,178 @@
|
||||
---
|
||||
sidebar_label: Weaviate
|
||||
keywords:
|
||||
- hasura
|
||||
- docs
|
||||
- databases
|
||||
- vector databases
|
||||
- ai
|
||||
- machine learning
|
||||
- weaviate
|
||||
sidebar_position: 2
|
||||
---
|
||||
|
||||
import Thumbnail from '@site/src/components/Thumbnail';
|
||||
|
||||
# Connect Hasura to Weaviate
|
||||
|
||||
[Weaviate](https://weaviate.io/) is a cloud-native, modular, real-time vector search engine that allows you to build
|
||||
intelligent applications by using machine learning models as the data layer. It is open-source and can be deployed
|
||||
on-premise or in the cloud.
|
||||
|
||||
:::info Connecting vector databases to Hasura
|
||||
|
||||
To connect a vector database to Hasura, you'll need to take advantage of
|
||||
[Hasura Data Connectors](/databases/data-connectors/index.mdx). You can deploy any custom data connector agent to Hasura
|
||||
Cloud using our CLI plugin. For more information, refer to the [docs](/hasura-cli/connector-plugin/index.mdx).
|
||||
|
||||
If you're curious what other connectors are available, check out our [NDC Hub](https://github.com/hasura/ndc-hub).
|
||||
|
||||
:::
|
||||
|
||||
## Step 1: Deploy a data connector agent
|
||||
|
||||
We'll use the Hasura CLI to deploy a custom data connector agent to Hasura Cloud. Below, we're using the `create`
|
||||
command and naming our connector `weaviate-connector:v1`. We're also passing in the GitHub repo URL for the connector
|
||||
agent using the `--github-repo-url` flag:
|
||||
|
||||
```bash
|
||||
hasura connector create weaviate-connector:v1 --github-repo-url https://github.com/hasura/weaviate_gdc/tree/main
|
||||
```
|
||||
|
||||
We can check on the progress of the deployment using the `status` command:
|
||||
|
||||
```bash
|
||||
hasura connector status weaviate-connector:v1
|
||||
```
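
If you'd rather not re-run the `status` command by hand, a small polling loop can do it for you. This is a sketch under assumptions: the only thing it relies on is that the command's output eventually contains the text `DONE`, as described in this guide. The helper names, intervals, and timeout are hypothetical, and the exact output format of the CLI is not assumed.

```python
import subprocess
import time


def hasura_status(connector="weaviate-connector:v1"):
    """Run the `status` command from this guide and return its raw output."""
    result = subprocess.run(
        ["hasura", "connector", "status", connector],
        capture_output=True,
        text=True,
    )
    return result.stdout


def wait_until_done(get_status, poll_seconds=5.0, timeout_seconds=600.0, sleep=time.sleep):
    """Call get_status() until its text contains DONE or the timeout elapses."""
    waited = 0.0
    while True:
        status = get_status()
        if "DONE" in status:
            return status
        if waited >= timeout_seconds:
            raise TimeoutError(f"connector not ready after {timeout_seconds} seconds")
        sleep(poll_seconds)
        waited += poll_seconds


# Usage (requires the Hasura CLI and the connector plugin to be installed):
# wait_until_done(hasura_status)
```

Passing the status function in as an argument keeps the loop independent of the CLI, so it can be exercised without deploying anything.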
|
||||
|
||||
Once the `DONE` status is returned, we can grab the URL for our data connector agent using the `list` command:
|
||||
|
||||
```bash
|
||||
hasura connector list
|
||||
```
|
||||
|
||||
This will return a list of all the custom data connector agents you own. **The second value returned is the URL which
|
||||
we'll use in the next step; copy it to your clipboard.**
|
||||
|
||||
## Step 2: Add the data connector agent to your Hasura Cloud project
|
||||
|
||||
In your Cloud project, navigate to the `Data` tab and click `Manage` in the left-hand sidebar.
|
||||
|
||||
At the bottom of the screen, you'll see an expandable section titled `Data Connector Agents`.
|
||||
|
||||
<Thumbnail
|
||||
src="/img/databases/vector-dbs/weaviate/weaviate_add-agent.png"
|
||||
alt="Add the agent for a Weaviate database"
|
||||
width="1000px"
|
||||
/>
|
||||
|
||||
Click this and scroll down to `Add Agent`.
|
||||
|
||||
Name this agent `weaviate`, paste the URL you copied from the CLI into the `URL` field, and click `Connect`.
|
||||
|
||||
<Thumbnail
|
||||
src="/img/databases/vector-dbs/weaviate/weaviate_configure-agent.png"
|
||||
alt="Configure the agent for a Weaviate database"
|
||||
width="1000px"
|
||||
/>
|
||||
|
||||
## Step 3: Select the driver
|
||||
|
||||
Navigate to the `Data` tab and select `Connect Database`, then select `Weaviate` from the list of drivers:
|
||||
|
||||
<Thumbnail
|
||||
src="/img/databases/vector-dbs/weaviate/weaviate_connect-db.png"
|
||||
alt="Select the Weaviate driver"
|
||||
width="1000px"
|
||||
/>
|
||||
|
||||
## Step 4: Connect your database
|
||||
|
||||
At this point, we'll need to configure a few parameters:
|
||||
|
||||
<Thumbnail
|
||||
src="/img/databases/vector-dbs/weaviate/connect-weaveate-database.png"
|
||||
alt="Connect Weaviate database"
|
||||
width="1000px"
|
||||
/>
|
||||
|
||||
| Parameter | Description |
|
||||
| ------------- | ------------------------------------------------------- |
|
||||
| Database Name | The name of your Weaviate database. |
|
||||
| `apiKey` | The API key for your Weaviate database. |
|
||||
| `host` | The URL of your Weaviate database. |
|
||||
| `openAPIKey` | The OpenAI key for use with your Weaviate database. |
|
||||
| `scheme` | The URL scheme for your Weaviate database (http/https). |
|
||||
|
||||
:::info Where can I find these parameters?
|
||||
|
||||
For the Weaviate-specific parameters, on the
|
||||
[Weaviate Cloud Services' Console](https://console.weaviate.cloud/dashboard), you can see your cluster's connection
|
||||
information on the cluster's card.
|
||||
|
||||
You can register for an OpenAI key [here](https://openai.com/blog/openai-api).
|
||||
|
||||
:::
|
||||
|
||||
## Step 5: Track your tables
|
||||
|
||||
To make schemas accessible for querying using GraphQL, we'll need to track them. In the example below, we're tracking a
|
||||
schema called `Resume` by checking the box next to it and clicking `Track Selected`:
|
||||
|
||||
<Thumbnail src="/img/databases/vector-dbs/weaviate/track-tables.png" alt="Track Weaviate tables" width="1000px" />
|
||||
|
||||
Tracking this schema will generate a type available in your GraphQL API that you can query against 🎉
|
||||
|
||||
:::info Don't have any tables to track?
|
||||
|
||||
You will need to define the schema in your vector database. For a walkthrough of setting up a Weaviate schema, refer to
|
||||
this [tutorial](https://weaviate.io/developers/weaviate/configuration/schema-configuration).
|
||||
|
||||
:::
|
||||
|
||||
## Step 6: Define a remote relationship
|
||||
|
||||
The information stored in Weaviate is vectorized and not in a human-readable format. We want to be able to return the
|
||||
information from our relational database using the vectorized data from Weaviate. To do this, we need to define a remote
|
||||
relationship.
|
||||
|
||||
In the example below, we're defining a remote relationship between the `Resume` schema in our vector database and the
|
||||
`application` table in our relational database. This way, whenever we query the vectorized information in our `Resume`
|
||||
table, we can return the information from our relational database.
|
||||
|
||||
<Thumbnail
|
||||
src="/img/databases/vector-dbs/weaviate/define-remote-relationship.png"
|
||||
alt="Define remote relationship"
|
||||
width="1000px"
|
||||
/>
|
||||
|
||||
## Step 7: Query your data
|
||||
|
||||
You can now query across both your vector database and your existing relational database tables as if they were in one
|
||||
location!
|
||||
|
||||
In our example, we have two tables in our relational database:
|
||||
|
||||
1. `candidate`
|
||||
|
||||
<Thumbnail src="/img/databases/vector-dbs/weaviate/candidate.png" alt="Candidate table" width="425px" />
|
||||
|
||||
2. `application`
|
||||
|
||||
<Thumbnail src="/img/databases/vector-dbs/weaviate/application.png" alt="Application table" width="700px" />
|
||||
|
||||
Our vector database stores the resumes as:
|
||||
|
||||
<Thumbnail src="/img/databases/vector-dbs/weaviate/resume-store.png" alt="Resume store" width="1000px" />
|
||||
|
||||
In the `API` tab of the Hasura Console, a single GraphQL query can fetch all the candidate and application information
for a resume. Hasura brings this all together to provide a seamless querying experience.
|
||||
|
||||
<Thumbnail src="/img/databases/vector-dbs/weaviate/execute-query.png" alt="Execute query" width="1000px" />
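
The same kind of query can also be sent from application code. The sketch below only builds the HTTP request; it does not send it. The field names inside the query (`content`, `application`, `candidate`, `name`, and the `limit` argument) and the endpoint URL are hypothetical placeholders: check the `API` tab for the names your tracked schema and remote relationship actually generate.

```python
import json
import urllib.request

# Hypothetical query: the field names depend on your tracked schema and on
# the name you gave the remote relationship.
QUERY = """
query ResumesWithApplications($limit: Int!) {
  Resume(limit: $limit) {
    content
    application {
      id
      candidate {
        name
      }
    }
  }
}
"""


def build_request(endpoint, admin_secret, limit=5):
    """Build (but do not send) a POST request for a Hasura GraphQL endpoint."""
    body = json.dumps({"query": QUERY, "variables": {"limit": limit}}).encode("utf-8")
    return urllib.request.Request(
        endpoint,
        data=body,
        headers={
            "Content-Type": "application/json",
            "x-hasura-admin-secret": admin_secret,
        },
        method="POST",
    )


# Placeholder endpoint and secret; replace them with your project's values.
request = build_request("https://example.hasura.app/v1/graphql", "my-admin-secret", limit=3)
print(request.get_method(), request.full_url)

# To execute the query:
# with urllib.request.urlopen(request) as response:
#     print(json.load(response))
```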
|
||||
|
||||
## Next Steps
|
||||
|
||||
- Check out our [Learn tutorial](https://hasura.io/learn/graphql/vectordbs/introduction/) on Generative AI using Hasura,
|
||||
Weaviate, Next.js and Tailwind CSS 🎉
|
||||
- Learn more about [Hasura Data Connectors](/databases/data-connectors/index.mdx).
|
||||
- Check out the available connectors on the [NDC Hub](https://github.com/hasura/ndc-hub)... or build your own!
|
@ -1,129 +0,0 @@
|
||||
---
|
||||
sidebar_label: Vector Databases
|
||||
keywords:
|
||||
- hasura
|
||||
- docs
|
||||
- databases
|
||||
- vector databases
|
||||
- ai
|
||||
- machine learning
|
||||
sidebar_position: 1.45
|
||||
---
|
||||
|
||||
import Thumbnail from '@site/src/components/Thumbnail';
|
||||
|
||||
# How Does Hasura Work With Vector Databases?
|
||||
|
||||
## What are vectors?
|
||||
|
||||
Vectors are mathematical representations for unstructured data like text, audio, or video data. Vectors generated from
|
||||
deep neural network models like large language models (LLMs) are of a high-dimension to capture multiple latent
|
||||
features, which can be then used to classify text, or cluster related text.
|
||||
|
||||
Word vectors are numerical representations of individual words that capture their meaning and usage patterns. Each word
|
||||
is represented as a vector in a high-dimensional space, where the dimensions correspond to different features or
|
||||
attributes of the word, such as its context, syntactic role, and semantic properties.
|
||||
|
||||
## Vectors in the context of Large Language Models
|
||||
|
||||
LLMs are trained on public datasets and until a certain point in time. They can’t be used to answer
|
||||
questions on your new organizational data.
|
||||
|
||||
For instance, at the moment, the last training date for OpenAI's `text-davinci-003` model was June 2021 and so it has
|
||||
no idea about events in 2023.
|
||||
|
||||
However, we can steer an LLM to answer queries relevant to our new data by providing the context in which it should
|
||||
answer the question. This is typically done by providing additional information that it can use. But, given the
|
||||
limitation on the input size we have to pick the right data to feed to it as context.
|
||||
|
||||
We do this by taking all the textual data and chunking it, which is an important strategy for your LLM application.
|
||||
|
||||
Now when an input user query comes in, we search for chunks that have similar context and feed to our LLMs.
|
||||
|
||||
## Why do we need vector databases?
|
||||
|
||||
Vector databases are optimized to search for similar vectors.
|
||||
|
||||
Chunking and then searching for relevant chunks can’t be done at query time for large systems with a large amount of
|
||||
text.
|
||||
|
||||
We first chunk and then store all chunked vectors in vector databases so that we can find relevant chunks using semantic
|
||||
search at the time of query.
|
||||
|
||||
## How does Hasura work with a vector databases?
|
||||
|
||||
Hasura connects with vector databases just the same way as you would any other relational database. Supported vector
|
||||
databases will be available for you to integrate in the `Connect Database` section in Console.
|
||||
|
||||
To demo these features please [check out our blog post](https://hasura.io/blog/hasura-brings-the-power-of-generative-ai-to-your-data/)
|
||||
on how to set it up with Weaviate.
|
||||
|
||||
### Step 1: Select the driver
|
||||
|
||||
In our case the driver is Weaviate:
|
||||
|
||||
<Thumbnail src="/img/databases/vector-dbs/vector-db-connect-db.png" alt="Add database source" width="700px" />
|
||||
|
||||
### Step 2: Connect your database
|
||||
|
||||
At this step you have to configure few parameters such as:
|
||||
|
||||
- API to access your vector db
|
||||
- Host of your vector db
|
||||
- URL scheme: http/https
|
||||
- The model you would like to use for factorizing text. In this demo example, we only support OpenAI. Hence it
|
||||
requests you to provide OpenAI key.
|
||||
|
||||
<Thumbnail src="/img/databases/vector-dbs/connect-weaveate-database.png" alt="Connect Weaviate database" width="1000px" />
|
||||
|
||||
### Step 3: Review your database in the `Data Manager` tab
|
||||
|
||||
<Thumbnail src="/img/databases/vector-dbs/data-manager-tab.png" alt="Data manager tab" width="1000px" />
|
||||
|
||||
### Step 4: Create your vector database table (schema)
|
||||
|
||||
You will need to define the schema in your vector database. For a walkthrough of setting up a Weaviate schema, refer
|
||||
to the [tutorial](https://weaviate.io/developers/weaviate/configuration/schema-configuration).
|
||||
|
||||
### Step 5: Track your tables
|
||||
|
||||
In order (for schemas) to be accessible for querying using Graph QL you will need to track them.
|
||||
|
||||
<Thumbnail src="/img/databases/vector-dbs/track-tables.png" alt="Connect Weaviate database" width="1000px" />
|
||||
|
||||
|
||||
### Step 6: Define the remote relationship
|
||||
|
||||
Define the remote relationship from your vector database to your relational database
|
||||
|
||||
<Thumbnail src="/img/databases/vector-dbs/define-remote-relationship.png" alt="Define remote relationship"
|
||||
width="1000px" />
|
||||
|
||||
|
||||
### Step 7: Go nuts! Query query query!
|
||||
|
||||
You can now query across both your vector database and your existing relational database tables as if they were in
|
||||
one location.
|
||||
|
||||
We have 2 tables in our relational database:
|
||||
|
||||
1. Candidate 1
|
||||
|
||||
<Thumbnail src="/img/databases/vector-dbs/candidate.png" alt="Candidate 1 table" width="425px" />
|
||||
|
||||
2. Application 2
|
||||
|
||||
<Thumbnail src="/img/databases/vector-dbs/application.png" alt="Application 2 table" width="700px" />
|
||||
|
||||
Our vector database stores the resume as:
|
||||
|
||||
<Thumbnail src="/img/databases/vector-dbs/resume-store.png" alt="Resume store" width="1000px" />
|
||||
|
||||
In our GraphQL query we are able to fetch all the candidate and application information for a resume. Hasura brings
|
||||
them all together to provide this seamless querying experience.
|
||||
|
||||
<Thumbnail src="/img/databases/vector-dbs/execute-query.png" alt="Execute query" width="1000px" />
|
||||
|
||||
|
||||
|
||||
|
BIN
docs/static/img/databases/vector-dbs/weaviate/weaviate_add-agent.png
vendored
Normal file
After Width: | Height: | Size: 66 KiB |
BIN
docs/static/img/databases/vector-dbs/weaviate/weaviate_configure-agent.png
vendored
Normal file
After Width: | Height: | Size: 65 KiB |
BIN
docs/static/img/databases/vector-dbs/weaviate/weaviate_connect-db.png
vendored
Normal file
After Width: | Height: | Size: 61 KiB |