Platform For AI: Use EAS and RDS PostgreSQL to deploy a RAG chatbot

Last Updated: Aug 02, 2025

This topic describes how to associate an RDS PostgreSQL instance with a RAG service when you deploy the service. It also describes the basic features of the RAG chatbot and the special features of RDS PostgreSQL.

Background information

Introduction to EAS

Elastic Algorithm Service (EAS) is an online model service platform of PAI that allows you to deploy models as online inference services or AI-powered web applications. EAS provides features such as auto scaling and blue-green deployment. These features reduce the costs of developing stable online model services that can handle a large number of concurrent requests. In addition, EAS provides features such as resource group management and model versioning and capabilities such as comprehensive O&M and monitoring. For more information, see EAS overview.

Introduction to RAG

With the rapid development of AI technology, generative AI has made remarkable achievements in fields such as text generation and image generation. However, as large language models (LLMs) are more widely used, the following inherent limitations have become apparent:

  • Domain knowledge limitations: In most cases, LLMs are trained on large-scale general-purpose datasets. As a result, LLMs struggle to provide in-depth, targeted processing for specialized vertical domains.

  • Information update delay: The static nature of the training datasets prevents LLMs from accessing and incorporating real-time information and knowledge updates.

  • Misleading outputs: LLMs are prone to hallucinations, producing outputs that appear plausible but are factually incorrect. This is attributed to factors such as data bias and inherent model limitations.

To address these challenges and enhance the capabilities and accuracy of LLMs, Retrieval-Augmented Generation (RAG) was developed. RAG integrates external knowledge bases to significantly mitigate LLM hallucinations and to improve the ability of LLMs to access and apply up-to-date knowledge. This enables the customization of LLMs for greater personalization and accuracy.

Introduction to RDS PostgreSQL

Alibaba Cloud Relational Database Service (RDS) supports the PostgreSQL engine. PostgreSQL's advantages include its complete implementation of SQL specifications and its support for a rich variety of data types, such as JSON, IP, and geometric data. In addition to fully supporting features such as transactions, subqueries, Multi-Version Concurrency Control (MVCC), and data integrity checks, RDS PostgreSQL also integrates important features such as high availability, backup, and recovery to reduce your O&M workload. For more information about the advanced features of RDS PostgreSQL, see RDS PostgreSQL.

Procedure

EAS provides a self-developed, systematic RAG solution with flexible parameter configurations. You can access RAG services by using a web user interface (UI) or by calling API operations to configure a custom RAG-based LLM chatbot. The technical architecture of RAG focuses on two stages: retrieval and generation. A conceptual sketch of how the two stages work together follows the list below.

  • Retrieval: EAS integrates a range of vector databases, including open source Faiss and Alibaba Cloud services such as Milvus, Elasticsearch, Hologres, OpenSearch, and AnalyticDB for PostgreSQL.

  • Generation: EAS supports various open source models such as Qwen, Meta Llama, Mistral, and Baichuan, and also integrates ChatGPT.
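Conceptually, a single round of RAG-based Q&A chains these two stages together: the question is matched against the knowledge base (retrieval), and the LLM answers by using the retrieved context (generation). The following minimal Python sketch only illustrates this flow; the embedding, search, and generation functions are trivial stand-ins for illustration and are not the actual EAS implementation.

    from typing import List

    def embed_text(text: str) -> List[float]:
        # Stand-in embedding: a character-frequency vector (illustration only).
        vector = [0.0] * 26
        for ch in text.lower():
            if "a" <= ch <= "z":
                vector[ord(ch) - ord("a")] += 1.0
        return vector

    def retrieve(question: str, documents: List[str], top_k: int = 2) -> List[str]:
        # Retrieval stage: rank knowledge base chunks by similarity to the question.
        query_vector = embed_text(question)
        def score(doc: str) -> float:
            return sum(q * d for q, d in zip(query_vector, embed_text(doc)))
        return sorted(documents, key=score, reverse=True)[:top_k]

    def generate(prompt: str) -> str:
        # Generation stage: a real service calls the LLM here.
        return f"[LLM answer based on a prompt of {len(prompt)} characters]"

    def answer_with_rag(question: str, documents: List[str]) -> str:
        context = "\n".join(retrieve(question, documents))
        prompt = f"Context:\n{context}\n\nQuestion: {question}"
        return generate(prompt)

    print(answer_with_rag(
        "What is EAS?",
        ["EAS is an online model service platform of PAI.",
         "RDS PostgreSQL can serve as the vector database."],
    ))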

This topic describes how to use EAS and RDS PostgreSQL to build a RAG-based LLM chatbot. The procedure is as follows:

  1. Prepare a vector database: RDS PostgreSQL

    First, you must create an RDS PostgreSQL instance and prepare the configuration items required to associate the RAG service with the instance during deployment.

  2. Deploy a RAG service and associate it with an RDS PostgreSQL instance

    You can deploy a RAG service on the EAS platform and associate it with the RDS PostgreSQL instance.

  3. Use the RAG chatbot

    You can connect to RDS PostgreSQL in the RAG chatbot, upload enterprise knowledge base files, and perform knowledge-based Q&A.

Prerequisites

A virtual private cloud (VPC), a vSwitch, and a security group are created. For more information, see Create a VPC with an IPv4 CIDR block and Create a security group.

Precautions

This practice is subject to the maximum number of tokens allowed by the LLM service and is designed to help you understand the basic retrieval feature of a RAG-based LLM chatbot.

Prepare a vector database: RDS PostgreSQL

Step 1: Create an RDS PostgreSQL instance and a database

  1. Create an RDS PostgreSQL instance.

    1. Click here to go to the RDS instance creation page.

    2. On the buy page, configure the following key parameters. For information about other parameters, see Create an RDS PostgreSQL instance.

      • Engine: Select PostgreSQL.

      • VPC: Select the VPC that you created.

      • Privileged Account: In the More Configurations section, configure a privileged account. Select Set Now and configure the database account and password.

    3. Follow the instructions in the console to complete the payment and activation.

  2. Create a database.

    1. Click the name of the instance that you created. In the navigation pane on the left, click Database Management and then click Create Database.

    2. In the Create Database panel, configure the Database (DB) Name. For Authorized Account, select the privileged account that you created. For more information about other parameters, see Create accounts and databases.

    3. After you configure the parameters, click Create.

Step 2: Prepare configuration items

  1. Retrieve the database endpoint.

    In the navigation pane on the left of the RDS PostgreSQL instance details page, choose Database Connection to view the internal endpoint, public endpoint, and the corresponding port number of the database.

    • Use an internal endpoint: The RAG service must be in the same VPC as the database.

    • Use a public endpoint: If EAS accesses an RDS PostgreSQL instance over the Internet, the EAS service must have Internet access. To ensure that the PostgreSQL instance can receive public requests from the EAS instance, you must request a public endpoint for the PostgreSQL instance and add the related Elastic IP Address (EIP) or 0.0.0.0/0 to the whitelist. The procedure is as follows:

      1. Request a public endpoint for the RDS PostgreSQL instance. For more information, see Enable or disable a public endpoint.

      2. To enable Internet access for EAS, you must attach a NAT Gateway and an EIP to the VPC that you use when you deploy the RAG service. For more information, see Use the SNAT feature of an Internet NAT gateway to access the Internet.

        Note

        The RAG service can use the same VPC as the RDS PostgreSQL instance or a different VPC.

      3. Add 0.0.0.0/0 or the EIP to the public endpoint whitelist of the RDS PostgreSQL instance. For more information, see Configure a whitelist.

  2. Retrieve the privileged account and password.

    In the navigation pane on the left of the RDS PostgreSQL instance details page, choose Account Management. On this page, you can view the privileged account that you created. The corresponding password was set when you created the instance. If you forget the password, you can click Reset Password to change it. After you collect these configuration items, you can optionally verify them as shown in the sketch after this list.
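Before you deploy the RAG service, you can verify that the configuration items are valid by connecting to the database from a machine that can reach the endpoint (a machine in the same VPC for an internal endpoint, or a whitelisted IP address for a public endpoint). The following is a minimal Python sketch that assumes the psycopg2 package is installed; the connection values are placeholders that you must replace with your own configuration items.

    import psycopg2  # pip install psycopg2-binary

    # Replace the placeholder values with the configuration items from this step.
    connection = psycopg2.connect(
        host="pgm-xxxxxxxx.pg.rds.aliyuncs.com",  # internal or public endpoint
        port=5432,                                # port number of the instance
        dbname="rag_db",                          # name of the database that you created
        user="rag_user",                          # privileged account
        password="your_password",                 # password of the privileged account
        connect_timeout=10,
    )
    with connection.cursor() as cursor:
        cursor.execute("SELECT version();")
        # Prints the PostgreSQL version if the connection works.
        print(cursor.fetchone()[0])
    connection.close()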

Deploy a RAG service and associate it with an RDS PostgreSQL instance

  1. Log on to the PAI console, select a workspace, and then click Enter EAS.

  2. On the Model Online Service (EAS) page, click Deploy Service. In the Scenario-based Model Deployment area, click RAG-based Smart Dialogue Deployment.


  3. On the Deploy RAG Chatbot For LLM page, configure the following key parameters. For information about other parameters, see Step 1: Deploy a RAG service.

    Basic Information

      • Version: Select Integrated LLM Deployment.

      • RAG Version: Select pai-rag:0.3.4.

      • Model Type: Select qwen1.5-1.8b.

    Resource Information

      • Deployment Resources: The system automatically recommends suitable resource specifications based on the selected model type. If you use other resource specifications, the model service may fail to start.

    Vector Database Settings

      • Version Type: Select RDS PostgreSQL.

      • Host: Set this parameter to the internal or public endpoint of the RDS PostgreSQL instance.

      • Port: Set this parameter to the port number of the RDS PostgreSQL instance, for example, 5432.

      • Database: Enter the name of the database that you created.

      • Table Name: Enter a new table name or the name of an existing table. For an existing table, the table schema must meet the PAI-RAG requirements. For example, you can enter the name of a table that was automatically created when you previously deployed a RAG service by using EAS.

      • Account: Enter the privileged account that you created.

      • Password: Enter the password of the privileged account.

      • OSS Path: Select an existing OSS storage directory in the current region. The knowledge base is managed by mounting this OSS path.

    VPC

      • VPC, VSwitch, and Security Group Name:

        • If you use an internal endpoint as the host, the RAG service must be configured in the same VPC as the RDS PostgreSQL instance.

        • If you use a public endpoint as the host, you must still configure a VPC for the RAG service and make sure that the VPC has Internet access. For more information, see Use the SNAT feature of an Internet NAT gateway to access the Internet. You also need to add the attached EIP or 0.0.0.0/0 to the public endpoint whitelist of the RDS PostgreSQL instance. For more information, see Configure a whitelist.

  4. After you configure the parameters, click Deploy.

Use the RAG chatbot

1. Check the vector database configuration

The following section describes how to use a RAG-based LLM chatbot. For more information, see Deploy a RAG-based LLM chatbot.

  1. Click the name of the target RAG service, and then click View Web App in the upper-right corner of the page.

  2. Check whether the PostgreSQL vector database is correctly configured.

    The system automatically configures the default knowledge base and applies the vector database settings that you specified when you deployed the RAG service. In the Vector Database Configuration section, check whether the PostgreSQL configuration is correct. If it is not, modify the configuration items and then click Update Knowledge Base.

2. Upload enterprise knowledge base files

On the File Management tab of the Knowledge Base tab, upload knowledge base files.

After you upload the files, the system automatically stores them in the vector database in the PAI-RAG format. For knowledge base files with the same name, vector databases other than FAISS overwrite the original files. Supported file types include .html, .htm, .txt, .pdf, .pptx, .md, Excel (.xlsx or .xls), .jsonl, .jpeg, .jpg, .png, .csv, and Word (.docx). For example, you can upload a test file such as rag_chatbot_test_doc.txt.

3. Perform knowledge-based Q&A

On the Chat tab, select a knowledge base name and an intent. Select Query Knowledge Base to use more tools. Then, you can perform knowledge-based Q&A on the web UI, or call the service by using API operations as shown in the sketch below.
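In addition to the web UI, you can perform knowledge-based Q&A by calling API operations, as described in Deploy a RAG-based LLM chatbot. The following minimal Python sketch shows the general invocation pattern for an EAS service: send an HTTP request to the service endpoint with the service token in the Authorization header. The service URL and token below are placeholders that you can obtain from the service details page in the EAS console, and the /service/query path and request body are assumptions that you should verify against the API reference of your RAG service version.

    import json
    import urllib.request

    # Placeholders: obtain the endpoint and token from the EAS console.
    SERVICE_URL = "http://xxxxxx.cn-hangzhou.pai-eas.aliyuncs.com/api/predict/rag_demo"
    SERVICE_TOKEN = "your-service-token"

    # Assumed request path and payload for a knowledge base query; verify them
    # against the API documentation of the deployed RAG version.
    payload = {"question": "Summarize the uploaded knowledge base files."}
    request = urllib.request.Request(
        url=SERVICE_URL + "/service/query",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": SERVICE_TOKEN,
            "Content-Type": "application/json",
        },
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        print(response.read().decode("utf-8"))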

Support for special features of RDS PostgreSQL

  1. Go to the RDS instance list, switch to the region where the instance is located, and then click the instance name to go to the instance details page.

  2. In the navigation pane on the left, choose Database Management, and then click SQL Query in the Actions column of the target database.

  3. Enter the Database Account and Database Password, which are the privileged account and password that you set when you created the RDS PostgreSQL instance, and then click Logon.

  4. After you log on, you can query the imported knowledge base data in the database instance, as shown in the sketch below.
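You can also run this kind of query programmatically instead of using the SQL Query console. The following minimal Python sketch reuses the connection parameters from the earlier connectivity check and assumes that rag_collection is the Table Name that you configured during deployment; because the actual column names depend on the schema that PAI-RAG creates, the sketch only counts rows and prints the column names.

    import psycopg2  # pip install psycopg2-binary

    # Reuse the connection parameters that you prepared earlier (placeholders below).
    connection = psycopg2.connect(
        host="pgm-xxxxxxxx.pg.rds.aliyuncs.com",
        port=5432,
        dbname="rag_db",
        user="rag_user",
        password="your_password",
    )
    TABLE_NAME = "rag_collection"  # the Table Name configured during deployment

    with connection.cursor() as cursor:
        # Count the knowledge base chunks that the RAG service has written.
        cursor.execute(f'SELECT COUNT(*) FROM "{TABLE_NAME}";')
        print("stored rows:", cursor.fetchone()[0])

        # Inspect the table structure without assuming specific column names.
        cursor.execute(f'SELECT * FROM "{TABLE_NAME}" LIMIT 1;')
        print("columns:", [column[0] for column in cursor.description])
    connection.close()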

References

  • EAS provides simplified deployment methods for typical cutting-edge scenarios of AI-Generated Content (AIGC) and LLM. You can easily deploy model services by using deployment methods such as ComfyUI, Stable Diffusion WebUI, ModelScope, Hugging Face, Triton Inference Server, and TensorFlow Serving. For more information, see Scenario-based deployment.

  • You can configure various inference parameters on the web UI of a RAG-based LLM chatbot to meet diverse requirements. You can also use the RAG-based LLM chatbot by calling API operations. For more information about implementation details and parameter settings, see Deploy a RAG-based LLM chatbot.

  • A RAG-based LLM chatbot can also be associated with other vector databases, such as OpenSearch or Elasticsearch. For more information, see Use EAS and Elasticsearch to deploy a RAG-based LLM chatbot or Use EAS and OpenSearch to deploy a RAG-based LLM chatbot.