|
6 | 6 | "source": [
|
7 | 7 | "# Google Cloud SQL for MySQL\n",
|
8 | 8 | "\n",
|
9 |
| - "> [Cloud SQL](https://cloud.google.com/sql) is a fully managed relational database service that offers high performance, seamless integration, and impressive scalability. It offers [MySQL](https://cloud.google.com/sql/mysql), [PostgreSQL](https://cloud.google.com/sql/postgres), and [SQL Server](https://cloud.google.com/sql/sqlserver) database engines. Extend your database application to build AI-powered experiences leveraging Cloud SQL's Langchain integrations.\n", |
| 9 | + "> [Cloud SQL](https://cloud.google.com/sql) is a fully managed relational database service that offers high performance, seamless integration, and impressive scalability. It offers [MySQL](https://cloud.google.com/sql/mysql), [PostgreSQL](https://cloud.google.com/sql/postgresql), and [SQL Server](https://cloud.google.com/sql/sqlserver) database engines. Extend your database application to build AI-powered experiences leveraging Cloud SQL's Langchain integrations.\n", |
10 | 10 | "\n",
|
11 | 11 | "This notebook goes over how to use [Cloud SQL for MySQL](https://cloud.google.com/sql/mysql) to [save, load and delete langchain documents](https://python.langchain.com/docs/modules/data_connection/document_loaders/) with `MySQLLoader` and `MySQLDocumentSaver`.\n",
|
12 | 12 | "\n",
|
| 13 | + "Learn more about the package on [GitHub](https://github.com/googleapis/langchain-google-cloud-sql-mysql-python/).\n", |
| 14 | + "\n", |
13 | 15 | "[](https://colab.research.google.com/github/googleapis/langchain-google-cloud-sql-mysql-python/blob/main/docs/document_loader.ipynb)"
|
14 | 16 | ]
|
15 | 17 | },
|
|
20 | 22 | "## Before You Begin\n",
|
21 | 23 | "\n",
|
22 | 24 | "To run this notebook, you will need to do the following:\n",
|
| 25 | + "\n", |
23 | 26 | "* [Create a Google Cloud Project](https://developers.google.com/workspace/guides/create-project)\n",
|
| 27 | + "* [Enable the Cloud SQL Admin API.](https://console.cloud.google.com/marketplace/product/google/sqladmin.googleapis.com)\n", |
24 | 28 | "* [Create a Cloud SQL for MySQL instance](https://cloud.google.com/sql/docs/mysql/create-instance)\n",
|
25 | 29 | "* [Create a Cloud SQL database](https://cloud.google.com/sql/docs/mysql/create-manage-databases)\n",
|
26 | 30 | "* [Add an IAM database user to the database](https://cloud.google.com/sql/docs/mysql/add-manage-iam-users#creating-a-database-user) (Optional)\n",
|
|
136 | 140 | "auth.authenticate_user()"
|
137 | 141 | ]
|
138 | 142 | },
|
139 |
| - { |
140 |
| - "cell_type": "markdown", |
141 |
| - "metadata": {}, |
142 |
| - "source": [ |
143 |
| - "### API Enablement\n", |
144 |
| - "The `langchain-google-cloud-sql-mysql` package requires that you [enable the Cloud SQL Admin API](https://console.cloud.google.com/flows/enableapi?apiid=sqladmin.googleapis.com) in your Google Cloud Project." |
145 |
| - ] |
146 |
| - }, |
147 |
| - { |
148 |
| - "cell_type": "code", |
149 |
| - "execution_count": null, |
150 |
| - "metadata": {}, |
151 |
| - "outputs": [], |
152 |
| - "source": [ |
153 |
| - "# enable Cloud SQL Admin API\n", |
154 |
| - "!gcloud services enable sqladmin.googleapis.com" |
155 |
| - ] |
156 |
| - }, |
157 | 143 | {
|
158 | 144 | "cell_type": "markdown",
|
159 | 145 | "metadata": {},
|
|
179 | 165 | "By default, [IAM database authentication](https://cloud.google.com/sql/docs/mysql/iam-authentication#iam-db-auth) will be used as the method of database authentication. This library uses the IAM principal belonging to the [Application Default Credentials (ADC)](https://cloud.google.com/docs/authentication/application-default-credentials) sourced from the environment.\n",
|
180 | 166 | "\n",
|
181 | 167 | "For more information on IAM database authentication, please see:\n",
|
| 168 | + "\n", |
182 | 169 | "* [Configure an instance for IAM database authentication](https://cloud.google.com/sql/docs/mysql/create-edit-iam-instances)\n",
|
183 | 170 | "* [Manage users with IAM database authentication](https://cloud.google.com/sql/docs/mysql/add-manage-iam-users)\n",
|
184 | 171 | "\n",
|
185 | 172 | "Optionally, [built-in database authentication](https://cloud.google.com/sql/docs/mysql/built-in-authentication) using a username and password to access the Cloud SQL database can also be used. Just provide the optional `user` and `password` arguments to `MySQLEngine.from_instance()`:\n",
|
| 173 | + "\n", |
186 | 174 | "* `user` : Database user to use for built-in database authentication and login\n",
|
187 | 175 | "* `password` : Database password to use for built-in database authentication and login."
|
188 | 176 | ]
|
|
207 | 195 | "### Initialize a table\n",
|
208 | 196 | "\n",
|
209 | 197 | "Initialize a table of default schema via `MySQLEngine.init_document_table(<table_name>)`. Table Columns:\n",
|
| 198 | + "\n", |
210 | 199 | "- page_content (type: text)\n",
|
211 | 200 | "- langchain_metadata (type: JSON)\n",
|
212 | 201 | "\n",
|
|
229 | 218 | "### Save documents\n",
|
230 | 219 | "\n",
|
231 | 220 | "Save langchain documents with `MySQLDocumentSaver.add_documents(<documents>)`. To initialize `MySQLDocumentSaver` class you need to provide 2 things:\n",
|
| 221 | + "\n", |
232 | 222 | "1. `engine` - An instance of a `MySQLEngine` engine.\n",
|
233 | 223 | "2. `table_name` - The name of the table within the Cloud SQL database to store langchain documents."
|
234 | 224 | ]
|
|
241 | 231 | },
|
242 | 232 | "outputs": [],
|
243 | 233 | "source": [
|
244 |
| - "from langchain_google_cloud_sql_mysql import MySQLDocumentSaver\n", |
245 | 234 | "from langchain_core.documents import Document\n",
|
| 235 | + "from langchain_google_cloud_sql_mysql import MySQLDocumentSaver\n", |
246 | 236 | "\n",
|
247 | 237 | "test_docs = [\n",
|
248 | 238 | " Document(\n",
|
|
274 | 264 | "metadata": {},
|
275 | 265 | "source": [
|
276 | 266 | "Load langchain documents with `MySQLLoader.load()` or `MySQLLoader.lazy_load()`. `lazy_load` returns a generator that queries the database only during iteration. To initialize the `MySQLLoader` class, you need to provide:\n",
|
| 267 | + "\n", |
277 | 268 | "1. `engine` - An instance of a `MySQLEngine` engine.\n",
|
278 | 269 | "2. `table_name` - The name of the table within the Cloud SQL database to store langchain documents."
|
279 | 270 | ]
|
|
345 | 336 | "For a table with the default schema (page_content, langchain_metadata), the deletion criteria are:\n",
|
346 | 337 | "\n",
|
347 | 338 | "A `row` should be deleted if there exists a `document` in the list, such that\n",
|
| 339 | + "\n", |
348 | 340 | "- `document.page_content` equals `row[page_content]`\n",
|
349 | 341 | "- `document.metadata` equals `row[langchain_metadata]`"
|
350 | 342 | ]
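
The default-schema deletion rule quoted in this hunk can be read as a simple predicate. The following is a minimal sketch in plain Python, not the library's implementation: `Doc` is a hypothetical stand-in for a langchain `Document`, `should_delete` is a hypothetical helper, and the rows are assumed to have their `langchain_metadata` JSON already decoded into a dict.

```python
from dataclasses import dataclass, field


@dataclass
class Doc:
    """Hypothetical stand-in for a langchain Document."""

    page_content: str
    metadata: dict = field(default_factory=dict)


def should_delete(row: dict, documents: list) -> bool:
    """Return True if some document matches the row on content AND metadata."""
    return any(
        doc.page_content == row["page_content"]
        and doc.metadata == row["langchain_metadata"]
        for doc in documents
    )


# Illustrative rows only; langchain_metadata is assumed to be decoded JSON.
rows = [
    {"page_content": "Granny Smith", "langchain_metadata": {"fruit_id": 1}},
    {"page_content": "Cavendish", "langchain_metadata": {"fruit_id": 2}},
]
to_delete = [Doc("Granny Smith", {"fruit_id": 1})]

# Keep every row that no document in the list matches.
remaining = [row for row in rows if not should_delete(row, to_delete)]
```

Both conditions must hold for the same document, so a document with matching content but different metadata leaves the row in place.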
|
|
402 | 394 | " CREATE TABLE IF NOT EXISTS `{TABLE_NAME}`(\n",
|
403 | 395 | " fruit_id INT AUTO_INCREMENT PRIMARY KEY,\n",
|
404 | 396 | " fruit_name VARCHAR(100) NOT NULL,\n",
|
405 |
| - " variety VARCHAR(50), \n", |
| 397 | + " variety VARCHAR(50),\n", |
406 | 398 | " quantity_in_stock INT NOT NULL,\n",
|
407 | 399 | " price_per_unit DECIMAL(6,2) NOT NULL,\n",
|
408 | 400 | " organic TINYINT(1) NOT NULL\n",
|
|
449 | 441 | "metadata": {},
|
450 | 442 | "source": [
|
451 | 443 | "We can specify the content and metadata we want to load by setting the `content_columns` and `metadata_columns` when initializing the `MySQLLoader`.\n",
|
| 444 | + "\n", |
452 | 445 | "1. `content_columns`: The columns to write into the `page_content` of the document.\n",
|
453 | 446 | "2. `metadata_columns`: The columns to write into the `metadata` of the document.\n",
|
454 | 447 | "\n",
|
|
487 | 480 | "metadata": {},
|
488 | 481 | "source": [
|
489 | 482 | "To save langchain documents into a table with customized metadata fields, we first need to create such a table via `MySQLEngine.init_document_table()` and specify the list of `metadata_columns` we want it to have. In this example, the created table will have the following columns:\n",
|
| 483 | + "\n", |
490 | 484 | "- description (type: text): for storing fruit description.\n",
|
491 | 485 | "- fruit_name (type: text): for storing fruit name.\n",
|
492 | 486 | "- organic (type: tinyint(1)): to tell if the fruit is organic.\n",
|
493 | 487 | "- other_metadata (type: JSON): for storing other metadata information of the fruit.\n",
|
494 | 488 | "\n",
|
495 | 489 | "We can use the following parameters with `MySQLEngine.init_document_table()` to create the table:\n",
|
| 490 | + "\n", |
496 | 491 | "1. `table_name`: The name of the table within the Cloud SQL database to store langchain documents.\n",
|
497 | 492 | "2. `metadata_columns`: A list of `sqlalchemy.Column` indicating the list of metadata columns we need.\n",
|
498 | 493 | "3. `content_column`: The name of column to store `page_content` of langchain document. Default: `page_content`.\n",
|
|
532 | 527 | "metadata": {},
|
533 | 528 | "source": [
|
534 | 529 | "Save documents with `MySQLDocumentSaver.add_documents(<documents>)`. As you can see in this example, \n",
|
| 530 | + "\n", |
535 | 531 | "- `document.page_content` will be saved into `description` column.\n",
|
536 | 532 | "- `document.metadata.fruit_name` will be saved into `fruit_name` column.\n",
|
537 | 533 | "- `document.metadata.organic` will be saved into `organic` column.\n",
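
The column mapping listed in this hunk can be sketched in plain Python. This is an illustration under stated assumptions, not the library's code: `document_to_row` and its parameter names are hypothetical, and routing leftover metadata keys into the `other_metadata` JSON column is inferred from the table description in the earlier hunk.

```python
import json


def document_to_row(
    page_content: str,
    metadata: dict,
    content_column: str,
    metadata_columns: list,
    metadata_json_column: str,
) -> dict:
    """Split a document into dedicated column values plus a JSON catch-all."""
    row = {content_column: page_content}
    leftover = {}
    for key, value in metadata.items():
        if key in metadata_columns:
            # The metadata key has its own table column.
            row[key] = value
        else:
            # Assumption: everything else is stored in the JSON column.
            leftover[key] = value
    row[metadata_json_column] = json.dumps(leftover)
    return row


row = document_to_row(
    page_content="Granny Smith 150 0.99",
    metadata={"fruit_name": "Apple", "organic": 1, "fruit_id": 1},
    content_column="description",
    metadata_columns=["fruit_name", "organic"],
    metadata_json_column="other_metadata",
)
```

Here `fruit_name` and `organic` land in their own columns, while `fruit_id`, which has no dedicated column, ends up inside the JSON value.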
|
|
585 | 581 | "We can also delete documents from table with customized metadata columns via `MySQLDocumentSaver.delete(<documents>)`. The deletion criteria is:\n",
|
586 | 582 | "\n",
|
587 | 583 | "A `row` should be deleted if there exists a `document` in the list, such that\n",
|
| 584 | + "\n", |
588 | 585 | "- `document.page_content` equals `row[page_content]`\n",
|
589 | 586 | "- For every metadata field `k` in `document.metadata`\n",
|
590 | 587 | " - `document.metadata[k]` equals `row[k]` or `document.metadata[k]` equals `row[langchain_metadata][k]`\n",
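
For a table with customized metadata columns, the deletion rule above checks each metadata field against either a dedicated column or the JSON metadata column. A minimal sketch in plain Python follows, assuming the row's JSON column is already decoded into a dict; `row_matches` and its parameters are hypothetical names, not part of the package.

```python
def row_matches(
    row: dict,
    page_content: str,
    metadata: dict,
    content_column: str = "page_content",
    metadata_json_column: str = "langchain_metadata",
) -> bool:
    """Return True if the row satisfies the deletion criteria for one document."""
    if row[content_column] != page_content:
        return False
    json_metadata = row.get(metadata_json_column) or {}
    for key, value in metadata.items():
        in_column = key in row and row[key] == value
        in_json = key in json_metadata and json_metadata[key] == value
        # Every metadata field must match in a column or in the JSON blob.
        if not (in_column or in_json):
            return False
    return True


# Illustrative row only.
row = {
    "page_content": "Granny Smith",
    "fruit_name": "Apple",
    "organic": 1,
    "langchain_metadata": {"fruit_id": 1},
}

matches_column_and_json = row_matches(
    row, "Granny Smith", {"fruit_name": "Apple", "fruit_id": 1}
)
mismatched_field = row_matches(row, "Granny Smith", {"organic": 0})
```

Because the check iterates over the fields of `document.metadata` only, a document with empty metadata matches any row with the same content.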
|
|