The AI.GENERATE_INT function
This document describes the AI.GENERATE_INT function, which lets you analyze any combination of text and unstructured data from BigQuery standard tables. For each row in the table, the function generates a STRUCT that contains an INT64 value.
The function works by sending requests to a Vertex AI Gemini model, and then returning that model's response.
You can use the AI.GENERATE_INT function to perform tasks such as classification and sentiment analysis.
Prompt design can strongly affect the responses returned by the model. For more information, see Introduction to prompting.
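As a rough illustration of the sentiment analysis use case, the following sketch asks the model for one integer rating per row. The mydataset.reviews table, its review column, and the us.test_connection connection are assumptions made for this example, and the 1 to 5 scale is an arbitrary choice rather than anything the function prescribes.

```sql
-- Hypothetical table: mydataset.reviews with a STRING column named review.
-- Hypothetical connection: us.test_connection.
SELECT
  review,
  AI.GENERATE_INT(
    ('Rate the sentiment of this review on a scale from 1 (very negative) ',
     'to 5 (very positive): ', review),
    connection_id => 'us.test_connection').result AS sentiment_score
FROM mydataset.reviews;
```

Stating the allowed range in the prompt makes it more likely that the returned INT64 values stay on the intended scale, although the function itself does not enforce the range.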
Input
The AI.GENERATE_INT function accepts the following types of input:
- Text data from standard tables.
- ObjectRefRuntime values that are generated by the OBJ.GET_ACCESS_URL function. You can use ObjectRef values from standard tables as input to the OBJ.GET_ACCESS_URL function. (Preview)
When you analyze unstructured data, that data must meet the following requirements:
- Content must be in one of the supported formats that are described in the Gemini API model mimeType parameter.
- If you are analyzing a video, the maximum supported length is two minutes. If the video is longer than two minutes, AI.GENERATE_INT only returns results based on the first two minutes.
Syntax
AI.GENERATE_INT(
  [ prompt => ] 'prompt',
  connection_id => 'connection'
  [, endpoint => 'endpoint']
  [, model_params => model_params]
)
Arguments
AI.GENERATE_INT takes the following arguments:
prompt: a STRING or STRUCT value that specifies the prompt to send to the model. The prompt must be the first argument that you specify. You can provide the prompt value in the following ways:
- Specify a STRING value. For example, ('Write a poem about birds').
- Specify a STRUCT value that contains one or more fields. You can use the following types of fields within the STRUCT value:

  Field type: STRING
  Description: A string literal, or the name of a STRING column.
  Examples:
    String literal: 'Is Seattle a US city?'
    String column name: my_string_column

  Field type: ARRAY<STRING>
  Description: You can only use string literals in the array.
  Examples:
    Array of string literals: ['Is ', 'Seattle', ' a US city']

  Field type: ObjectRefRuntime
  Description: An ObjectRefRuntime value returned by the OBJ.GET_ACCESS_URL function. The OBJ.GET_ACCESS_URL function takes an ObjectRef value as input, which you can provide by either specifying the name of a column that contains ObjectRef values, or by constructing an ObjectRef value. ObjectRefRuntime values must have the access_url.read_url and details.gcs_metadata.content_type elements of the JSON value populated.
  Examples:
    Function call with ObjectRef column: OBJ.GET_ACCESS_URL(my_objectref_column, 'r')
    Function call with constructed ObjectRef value: OBJ.GET_ACCESS_URL(OBJ.MAKE_REF('gs://image.jpg', 'myconnection'), 'r')

  Field type: ARRAY<ObjectRefRuntime>
  Description: ObjectRefRuntime values returned from multiple calls to the OBJ.GET_ACCESS_URL function. The OBJ.GET_ACCESS_URL function takes an ObjectRef value as input, which you can provide by either specifying the name of a column that contains ObjectRef values, or by constructing an ObjectRef value. ObjectRefRuntime values must have the access_url.read_url and details.gcs_metadata.content_type elements of the JSON value populated.
  Examples:
    Function calls with ObjectRef columns: [OBJ.GET_ACCESS_URL(my_objectref_column1, 'r'), OBJ.GET_ACCESS_URL(my_objectref_column2, 'r')]
    Function calls with constructed ObjectRef values: [OBJ.GET_ACCESS_URL(OBJ.MAKE_REF('gs://image1.jpg', 'myconnection'), 'r'), OBJ.GET_ACCESS_URL(OBJ.MAKE_REF('gs://image2.jpg', 'myconnection'), 'r')]
  The function combines STRUCT fields similarly to a CONCAT operation and concatenates the fields in their specified order. The same is true for the elements of any arrays used within the struct. The following examples show some STRUCT prompt values and how they are interpreted:

  Struct field types: STRUCT<STRING>
  Struct value: ('Describe the city of Seattle')
  Semantic equivalent: 'Describe the city of Seattle'

  Struct field types: STRUCT<STRING, STRING, STRING>
  Struct value: ('Describe the city ', my_city_column, ' in 15 words')
  Semantic equivalent: 'Describe the city my_city_column_value in 15 words'

  Struct field types: STRUCT<STRING, ARRAY<STRING>>
  Struct value: ('Describe ', ['the city of', 'Seattle'])
  Semantic equivalent: 'Describe the city of Seattle'

  Struct field types: STRUCT<STRING, ObjectRefRuntime>
  Struct value: ('Describe this city', OBJ.GET_ACCESS_URL(image_objectref_column, 'r'))
  Semantic equivalent: 'Describe this city' image

  Struct field types: STRUCT<STRING, ObjectRefRuntime, ObjectRefRuntime>
  Struct value:
    ('If the city in the first image is within the country of the second image, provide a ten word description of the city',
    OBJ.GET_ACCESS_URL(city_image_objectref_column, 'r'),
    OBJ.GET_ACCESS_URL(country_image_objectref_column, 'r'))
  Semantic equivalent: 'If the city in the first image is within the country of the second image, provide a ten word description of the city' city_image country_image
connection_id: a STRING value specifying the connection to use to communicate with the model, in the format [PROJECT_ID].[LOCATION].[CONNECTION_ID]. For example, myproject.us.myconnection.
Replace the following:
- PROJECT_ID: the project ID of the project that contains the connection.
- LOCATION: the location used by the connection. The connection must be in the same location as the dataset that contains the model.
- CONNECTION_ID: the connection ID, for example, myconnection.
You can get this value by viewing the connection details in the Google Cloud console and copying the value in the last section of the fully qualified connection ID that is shown in Connection ID. For example, projects/myproject/locations/connection_location/connections/myconnection.
You need to grant the Vertex AI User role to the connection's service account in the project where you run the function.
endpoint: a STRING value that specifies the Vertex AI endpoint to use for the model. Only Gemini models are supported. If you specify the model name, BigQuery ML automatically identifies and uses the full endpoint of the model. If you don't specify an endpoint value, BigQuery ML selects a recent stable version of Gemini to use.
model_params: a JSON literal that provides additional parameters to the model. The model_params value must conform to the generateContent request body format. You can provide a value for any field in the request body except for the contents field; the contents field is populated with the prompt argument value.
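To show how the arguments fit together, the following sketch passes a fully qualified connection ID, a model name as the endpoint, and a model_params value. It assumes the mydataset.cities table used in the examples in this document and a connection named myconnection in project myproject and location us. The generationConfig.temperature setting is only an illustration of a generateContent request body field; check the field names and values against the documentation for the model you use.

```sql
-- Assumptions: mydataset.cities has a STRING column named city, and the
-- connection myproject.us.myconnection exists. The temperature value is
-- illustrative, not a recommended setting.
SELECT
  city,
  AI.GENERATE_INT(
    ('What is the population of ', city),
    connection_id => 'myproject.us.myconnection',
    endpoint => 'gemini-2.0-flash',
    model_params => JSON '{"generationConfig": {"temperature": 0}}').result AS population
FROM mydataset.cities;
```

The contents field is deliberately absent from model_params, because the function populates it from the prompt argument.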
Output
AI.GENERATE_INT returns a STRUCT value for each row in the table. The struct contains the following fields:
- result: an INT64 value containing the model's response to the prompt. The result is NULL if the request fails or is filtered by responsible AI.
- full_response: a STRING value containing the JSON response from the projects.locations.endpoints.generateContent call to the model. The generated text is in the text element. The safety attributes are in the safety_ratings element.
- status: a STRING value that contains the API response status for the corresponding row. This value is empty if the operation was successful.
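One way to use these fields together is to keep the whole STRUCT and then inspect the rows that produced no result. The following sketch assumes the mydataset.cities table and the us.test_connection connection used in the examples in this document; the population alias is a name chosen for this illustration.

```sql
-- Assumptions: mydataset.cities has a STRING column named city, and the
-- connection us.test_connection exists.
WITH generated AS (
  SELECT
    city,
    AI.GENERATE_INT(
      ('What is the population of ', city),
      connection_id => 'us.test_connection') AS population
  FROM mydataset.cities
)
SELECT
  city,
  population.status,         -- API response status for the row
  population.full_response   -- JSON response from the model, as a STRING
FROM generated
WHERE population.result IS NULL;  -- request failed or was filtered
```

Filtering on a NULL result rather than on the status text avoids depending on the exact wording of the status message.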
Examples
The following examples demonstrate how to use the AI.GENERATE_INT function.
Use string input
Suppose you have the following table called mydataset.cities with a single city column:
+---------+
| city    |
+---------+
| Seattle |
| Beijing |
| Paris   |
| London  |
+---------+
To determine the population of each city, you can call the AI.GENERATE_INT function and select the result field in the output by running the following query:
SELECT
  city,
  AI.GENERATE_INT(
    ('What is the population of ', city),
    connection_id => 'us.test_connection',
    endpoint => 'gemini-2.0-flash').result
FROM mydataset.cities;
The result is similar to the following:
+---------+----------+
| city    | result   |
+---------+----------+
| Seattle | 771455   |
| Beijing | 21893000 |
| Paris   | 2165423  |
| London  | 9787426  |
+---------+----------+
Use ObjectRefRuntime input
Suppose you have the following table called mydataset.animals with a single STRUCT column that uses the ObjectRef format and contains images of animals:
+----------------------------+-----------------+--------------------+----------------------------------------------------------+
| animals.uri                | animals.version | animals.authorizer | animals.details                                          |
+----------------------------+-----------------+--------------------+----------------------------------------------------------+
| gs://mybucket/snake.jpeg   | 12345678        | us.conn            | {"gcs_metadata":{"content_type":"image/jpeg","md5_hash"… |
+----------------------------+-----------------+--------------------+----------------------------------------------------------+
| gs://mybucket/horse.bmp    | 23456789        | us.conn            | {"gcs_metadata":{"content_type":"image/bmp","md5_hash"…  |
+----------------------------+-----------------+--------------------+----------------------------------------------------------+
| gs://mybucket/spider.jpeg  | 234567890       | us.conn            | {"gcs_metadata":{"content_type":"image/jpeg","md5_hash"… |
+----------------------------+-----------------+--------------------+----------------------------------------------------------+
To determine how many legs each animal has, call the AI.GENERATE_INT function and select the result field in the output by running the following query:
SELECT
  AI.GENERATE_INT(
    ('How many legs does this ', OBJ.GET_ACCESS_URL(animals, 'r'), ' have?'),
    connection_id => 'us.test_connection',
    endpoint => 'gemini-2.0-flash').result
FROM mydataset.animals;
The result is similar to the following:
+--------+
| result |
+--------+
| 0      |
| 4      |
| 8      |
+--------+
Locations
You can run AI.GENERATE_INT in all of the regions that support Gemini models, and also in the US and EU multi-regions.
Quotas
See Vertex AI and Cloud AI service functions quotas and limits.
What's next
- For more information about using Vertex AI models to generate text and embeddings, see Generative AI overview.
- For more information about using Cloud AI APIs to perform AI tasks, see AI application overview.