11. Deep Learning
A subset of machine learning that
uses neural networks with many
layers (hence “deep”) and excels at
processing complex data like
images, audio and text.
13. Gen(erative) AI
A part of deep learning focused
on creating new content like text,
images or music. In other words
it “generates” stuff.
14. LLM
Large Language Model
A type of model trained on
massive amounts of text data. It
can answer questions and write
text.
Gemini is an LLM developed by
Google.
15. Model
A system that has been trained
to recognise patterns and make
predictions or decisions based
on data.
A tool that learns from examples
and uses that knowledge to
solve problems or perform a
task.
16. Model
Models can be tailored.
“Sit, stay, down” - your
average dog.
But there are firemen’s
dogs, K-9s, guide dogs
etc.
18. Multimodal Model
A type of model that can
understand various inputs (e.g.
audio, image, video, text) and
generate various outputs (e.g.
audio, image and text)
gemini-2.0-flash-001 is a
multimodal model. (*coming soon)
https://ai.google.dev/gemini-api/docs/models/gemini#gemini-2.0-flash
20. Transformer
Transformer architecture is a
super-smart text processor.
It uses an attention mechanism
with which it can figure out how
words relate to each other.
“The cat chased the mouse, and
it ran away”
21. Attention
Attention assigns a weight to
each word based on its
relevance to others, scoring their
importance.
This helps the system to
determine relationships between
words and use that context to
determine meaning.
https://www.youtube.com/watch?v=KJtZARuO3JY
22. Hallucination
Models can generate information
that is incorrect, irrelevant or
completely made up even
though it might sound plausible.
Always do fact-checking.
26. Transactional
These interactions generate an
answer based on an input but
they are one-off. They do not
“remember” the conversation.
There’s no actual conversation
here.
27. Transactional
import { GoogleGenerativeAI } from '@google/generative-ai';

const genAI = new GoogleGenerativeAI(process.env.GEMINI_API_KEY);
const model = genAI.getGenerativeModel({ model: 'gemini-1.5-flash' });

const prompt = 'What is Star Wars?';
const result = await model.generateContent(prompt);
console.log(result.response.text());
31. Temperature
Parameter that controls the
randomness (creativity) of a
model’s responses.
Different models have different
temperature control parameters.
gemini-1.5-flash: 0.0 - 2.0 (default 1.0)
gemini-2.0-flash: 0.0 - 2.0 (default 0.7)
32. Temperature
What is the capital of Italy?
T0.2: “Rome”
T0.8: “Rome, the Eternal City”
T1.2: “Rome is the capital of Italy,
known for its ancient history and
landmarks like the Colosseum”
T1.6: “Rome, though Florence could claim
the title for its art and culture”
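In code, temperature is set through the generationConfig object. A minimal sketch, assuming the @google/generative-ai SDK from the earlier transactional example (the API call itself is left commented out):

```javascript
// Lower temperature = more deterministic, higher = more varied
// (0.0-2.0 for gemini-1.5-flash, per the slide above).
const generationConfig = { temperature: 0.2 };

// It is passed when creating the model:
// const model = genAI.getGenerativeModel({ model: 'gemini-1.5-flash', generationConfig });
```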
36. Chat
History can be maintained by
passing a history array to the
startChat method.
const history = [];
let result = await model
.startChat({ history })
.sendMessageStream(input);
37. Chat
Content needs to be associated
with a role. Gemini supports 2
roles:
user and model.
User is the role which provides
the prompts.
Model is the role that provides
the responses.
38. Chat
Note that each entry in the
history array must conform to
the following schema.
{
  role: 'user',
  parts: [{
    text: userInput
  }]
}
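Each turn in the conversation is one such object appended to the array. A minimal sketch of building up a history (appendTurn is a hypothetical helper, not part of the SDK):

```javascript
// History is a plain array of { role, parts } objects, alternating
// user and model turns (assumption: @google/generative-ai shapes).
const history = [
  { role: 'user', parts: [{ text: 'Who created Star Wars?' }] },
  { role: 'model', parts: [{ text: 'George Lucas created Star Wars.' }] },
];

// Hypothetical helper: append one turn in the required shape.
function appendTurn(history, role, text) {
  history.push({ role, parts: [{ text }] });
}

appendTurn(history, 'user', 'When was the first film released?');
// The next call then sees the whole conversation:
// const chat = model.startChat({ history });
```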
39. Try it out
npm run chat
npm run chat-stream
npm run chat-with-history
40. Token limits
Models have token limits on both
input and output.
For example gemini-2.0-flash-001
has a 1,048,576-token input limit
and an 8,192-token output limit.
https://ai.google.dev/gemini-api/docs/models/gemini#gemini-2.0-flash
41. Model parameters
Models take parameter values
that control how they generate a
response.
https://cloud.google.com/vertex-ai/generative-ai/docs/learn/prompts/adjust-parameter-values
42. Max output tokens
The maximum number of tokens
the model may return. Generally
100 tokens is about 60-80 words.
https://cloud.google.com/vertex-ai/generative-ai/docs/learn/prompts/adjust-parameter-values
43. Top-K
Top-K controls how a model
selects tokens by limiting the
choices to the K most probable
tokens.
Top-K = 1 means that the model
always selects the single most
probable token.
Top-K = 3 means it samples from
the 3 most probable tokens.
https://cloud.google.com/vertex-ai/generative-ai/docs/learn/prompts/adjust-parameter-values
44. Top-P
Top-P dynamically limits token
selection to the smallest set of
words where the cumulative
probability meets a threshold.
Lower values make responses
more predictable. Higher values
allow less probable words to be
considered.
https://cloud.google.com/vertex-ai/generative-ai/docs/learn/prompts/adjust-parameter-values
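Temperature, Top-K, Top-P and the output-token cap all travel in the same generationConfig object. A sketch with illustrative values, assuming the @google/generative-ai SDK (valid ranges and defaults differ per model, see the linked docs):

```javascript
const generationConfig = {
  temperature: 0.7,     // randomness of sampling
  topK: 40,             // only consider the 40 most probable tokens...
  topP: 0.95,           // ...then trim to the smallest set whose probability sums to 0.95
  maxOutputTokens: 200, // roughly 120-160 words
};

// const model = genAI.getGenerativeModel({ model: 'gemini-1.5-flash', generationConfig });
```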
46. Safety
Block harmful content such as
harassment, hate, sexually
explicit language, dangerous
content, and content that could
jeopardise civic integrity.
https://ai.google.dev/gemini-api/docs/safety-settings
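Safety settings are supplied as a list of category/threshold pairs. A sketch using the plain string forms accepted by the Gemini API (the SDK also exports HarmCategory and HarmBlockThreshold enums that resolve to these; only a subset of categories is shown):

```javascript
// Each entry says: for this category of harm, block responses at or
// above this probability threshold.
const safetySettings = [
  { category: 'HARM_CATEGORY_HARASSMENT', threshold: 'BLOCK_LOW_AND_ABOVE' },
  { category: 'HARM_CATEGORY_HATE_SPEECH', threshold: 'BLOCK_MEDIUM_AND_ABOVE' },
];

// const model = genAI.getGenerativeModel({ model: 'gemini-1.5-flash', safetySettings });
```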
48. Context
What happens in multi-turn
conversational systems with the
context?
Often there’s a cutoff: we ask the
LLM to generate a summary of the
conversation and store that
summary, so that the model stays
aware of what was discussed so
far without running out of context.
49. Structured Output
Instruct the model to respond
using a specific format. The
format can be embedded in the
prompt, or a schema can be built
programmatically.
https://ai.google.dev/gemini-api/docs/structured-output?hl=en&lang=node
50. Structured Output
const schema = {
  description: 'Array of original Star Wars films with details',
  type: SchemaType.ARRAY,
  items: {
    type: SchemaType.OBJECT,
    properties: {
      title: {
        type: SchemaType.STRING,
        description: 'Title of the film',
        nullable: false,
      },
      released: {
        type: SchemaType.STRING,
        description: 'Release date of the film',
        nullable: false,
      },
      characters: {
        type: SchemaType.STRING,
        description: 'Notable characters in the film',
        nullable: false,
      },
      plot: {
        type: SchemaType.STRING,
        description: "Short summary of the film's plot",
        nullable: false,
      },
    },
    required: ['title', 'released', 'characters', 'plot'],
  },
};
https://ai.google.dev/gemini-api/docs/structured-output?hl=en&lang=node
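A schema like the one above is attached to the request through generationConfig, alongside a JSON MIME type. A trimmed-down sketch, using the plain string values that the SchemaType enum resolves to:

```javascript
// Trimmed-down schema: an array of objects with just a title field.
const schema = {
  type: 'ARRAY',
  items: {
    type: 'OBJECT',
    properties: { title: { type: 'STRING' } },
    required: ['title'],
  },
};

const generationConfig = {
  responseMimeType: 'application/json', // ask for JSON back...
  responseSchema: schema,               // ...constrained to this shape
};

// const model = genAI.getGenerativeModel({ model: 'gemini-1.5-flash', generationConfig });
// const films = JSON.parse(result.response.text());
```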
52. Live Information
Models are trained on vast
amounts of data, and they have a
training cut-off date. In other
words, information that is inside
a model may be outdated.
Models also cannot retrieve live
information on their own.
54. Function Calling
Allows AI to call predefined
functions to perform specific
actions. These actions can be for
data retrieval or anything else.
A function is just (in our case) a
JavaScript function.
55. Function Calling
const model = genAI.getGenerativeModel({
  model: 'gemini-2.0-flash-exp',
  tools: [
    {
      functionDeclarations: [
        {
          name: 'getWeather',
          description:
            'gets the weather for a city and returns the forecast using the metric system.',
          parameters: {
            type: 'object',
            properties: {
              city: {
                type: 'string',
                description: 'the city for which the weather is requested',
              },
            },
            required: ['city'],
          },
        },
      ],
    },
  ],
  toolConfig: { functionCallingConfig: { mode: 'AUTO' } },
});
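The model never runs getWeather itself; it returns a function call (name plus parsed arguments) that your code must dispatch, and whose result is sent back so the model can phrase the final answer. A sketch of that dispatch step (the forecast data is made up for illustration):

```javascript
// Our local implementations of the declared functions.
const functions = {
  getWeather: ({ city }) => ({ city, forecast: '31°C, thunderstorms' }),
};

// In the SDK, calls come from result.response.functionCalls(); here we
// dispatch one such { name, args } object by hand:
function dispatch(call) {
  return functions[call.name](call.args);
}

const reply = dispatch({ name: 'getWeather', args: { city: 'Singapore' } });
// reply would be sent back to the model as a functionResponse part.
```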
59. Vector
Vectorisation is the process of
turning words into vectors (lists
of numbers).
The attention mechanism uses
the context to assign meaning to
a word, allowing the model to
understand and process words
accurately.
62. Vector
Remember: context matters.
Words like “deposit” and “money”
are a strong signal that “bank”
means a financial institution.
“Sat”, “river” and “by” suggest
“bank” means riverbank.
63. Embedding
An embedding is a way of
representing words as numerical
vectors.
For example
“king” [0.8, 0.1, 0.6]
“queen” [0.8, 0.1, 0.8]
“dog” [0.3, 0.8, 0.2]
64. Distance
The distance between the
vectors determines how “close”
they are to one another.
For example
“king” [0.8, 0.1, 0.6]
“queen” [0.8, 0.1, 0.8]
“dog” [0.3, 0.8, 0.2]
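A common distance measure for embeddings is cosine similarity, which compares the direction of two vectors. A sketch over the toy vectors above:

```javascript
// Cosine similarity: 1 means "pointing the same way", 0 means unrelated.
function cosine(a, b) {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return dot / (Math.sqrt(na) * Math.sqrt(nb));
}

const king = [0.8, 0.1, 0.6];
const queen = [0.8, 0.1, 0.8];
const dog = [0.3, 0.8, 0.2];

console.log(cosine(king, queen) > cosine(king, dog)); // true: "queen" is closer to "king"
```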
65. Distance
Think about a product
recommendation. And let’s do
some super-simple maths.
Let’s say that Kiasu Karen
bought an apple 🍎 and a banana
🍌.
We take sweetness and texture
into consideration.
76. Embedding
Embeddings are important
because they allow for use-
cases that revolve around QA
systems, recommendation
systems and similarity searches.
Note that searches can only be
run against vector indices.
77. Embedding
1. Ingest data to BigQuery
2. Create a model
3. Create embeddings
4. Create index
5. Query
78. CREATE OR REPLACE MODEL `ai-workshop-448608.ai_workshop_films.film_embedding`
REMOTE WITH CONNECTION `ai-workshop-448608.asia-southeast1.ai_workshop_connection`
OPTIONS(
  endpoint = 'text-embedding-005'
);

CREATE OR REPLACE TABLE `ai-workshop-448608.ai_workshop_films.films_with_embeddings` AS (
  SELECT *
  FROM ML.GENERATE_EMBEDDING(
    MODEL `ai-workshop-448608.ai_workshop_films.film_embedding`,
    (
      SELECT *, CONCAT(title, '', overview) AS content
      FROM `ai-workshop-448608.ai_workshop_films.films`
    )
  )
);

CREATE OR REPLACE VECTOR INDEX `idx`
ON `ai-workshop-448608.ai_workshop_films.films_with_embeddings`
(ml_generate_embedding_result)
OPTIONS(
  distance_type='COSINE', index_type='IVF', ivf_options='{"num_lists": 10}'
);
https://www.youtube.com/watch?v=eztSNAZ0f_4
79. https://www.youtube.com/watch?v=eztSNAZ0f_4
SELECT * FROM VECTOR_SEARCH(
  TABLE `ai-workshop-448608.ai_workshop_films.films_with_embeddings`,
  'ml_generate_embedding_result',
  (
    SELECT ml_generate_embedding_result AS embedding_col
    FROM ML.GENERATE_EMBEDDING(
      MODEL `ai-workshop-448608.ai_workshop_films.film_embedding`,
      (SELECT "Star Wars" AS content),
      STRUCT(TRUE AS flatten_json_output)
    )
  ),
  top_k => 5
);
80. Try it out
npm run recommendation ""
npm run recommendation-advanced
81. Similarity Search
1. Embed the content
2. Embed the question/query
Return results based on
similarity (distance) matches.
83. Similarity Search
A user read and liked “Harry Potter
and the Prisoner of Azkaban”.
Book Similarity Score
Harry Potter and the Cursed Child 0.98
Percy Jackson & The Olympians 0.85
The Magicians by Lev Grossman 0.82
The other two results have lower similarity
scores but are more relevant to what the user
originally read.
https://www.youtube.com/watch?v=o5_t6Ai--ws
84. Fine Tuning
Process of adapting a pre-trained
model for specific tasks or
use-cases. Done by providing
examples to the model. (This is
also called supervised fine-tuning.)
https://ai.google.dev/gemini-api/docs/model-tuning
86. RAG
Retrieval-Augmented Generation
is a technique where retrieval is
done from source(s) specified by
the user and generation is done
using that information, answering
in natural language.
https://cloud.google.com/use-cases/retrieval-augmented-generation?hl=en
89. Agents
An AI agent is a smart assistant
that can observe a situation,
think about what to do and act to
achieve the desired outcome/
goal.
90. Agents
There are many types of agents -
some are just calling functions,
some systems can also be multi-
agent.
91. Agents
Multi-agent systems can pass
work to each other and
effectively communicate with
each other to react to a task.
92. Agentic RAG
Agentic RAG combines AI agents
with RAG.
It initially retrieves answers from
specified data sources but can
autonomously expand its search
via function calling if no relevant
information is found.
97. Generative Fill
Uses a method called inpainting,
where the AI looks at the colours,
textures and patterns around the
selected region and generates new
content that blends with the
rest of the image.
101. Try it out
npm run ecommerce
https://videoapi.cloudinary.com/video-demo/
video-smart-cropping
102. C2PA
C2PA (Coalition for Content
Provenance and Authenticity) is a
standard developed to provide a
framework for verifying
authenticity and provenance of
digital content.
105. Evals
Evals are a method for testing AI
systems to assess their
performance and reliability. They
are crucial in production to
measure how well a model
functions and responds to
different inputs.
https://cloud.google.com/vertex-ai/generative-ai/docs/models/determine-eval
106. Evals
A basic eval could be: “Did the AI
call the function when it had to?”
Score 1 - yes, Score 0 - no
Learn more
https://github.com/
braintrustdata/autoevals
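That yes/no eval can be expressed as a tiny scoring function. A sketch, assuming the response object exposes its function calls as an array (the shape is illustrative, not the SDK's exact API):

```javascript
// Score 1 if the model produced the expected function call, 0 otherwise.
function scoreFunctionCall(response, expectedName) {
  const calls = response.functionCalls || [];
  return calls.some((c) => c.name === expectedName) ? 1 : 0;
}

console.log(scoreFunctionCall({ functionCalls: [{ name: 'getWeather' }] }, 'getWeather')); // 1
console.log(scoreFunctionCall({ functionCalls: [] }, 'getWeather')); // 0
```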