Data Analysis with TensorFlow in PostgreSQL
Dave Page
12 May 2021
Dave Page
● EDB (CTO Office)
○ VP & Chief Architect, Database Infrastructure
● PostgreSQL
○ Core Team
○ pgAdmin Lead Developer
2021 Copyright © EnterpriseDB Corporation All Rights Reserved
In this talk...
● What are PostgreSQL, pl/python3 and TensorFlow?
● Why would I use them together?
● Examples of analysis types.
● Calling TensorFlow from PostgreSQL.
● Preparing data.
● Designing a network.
● Training a model.
● Performing analysis.
Software
What is PostgreSQL?
50,000 foot overview
● Relational, SQL based database.
● Fully enterprise ready; increasingly replacing Oracle, SQL Server, DB2 and more.
● Used in pretty much every sector: government, law enforcement, financial, healthcare…
● Possibly the most SQL Standard compliant database there is.
● Highly extensible:
○ Plugin extension modules.
○ Plugin procedural languages (e.g. Python, Perl, R, Java, v8).
○ Low level code hooks.
What is pl/python3?
50,000 foot overview
● Procedural language for PostgreSQL.
● Write stored procedures, functions and anonymous blocks within your database.
● Supports Python 3:
○ Don’t try to use pl/python, which uses the now-obsolete Python 2!
● The vast Python ecosystem of libraries may be used.
● Combines the power of Python with PostgreSQL.
What is TensorFlow?
50,000 foot overview
● Open Source Machine Learning library.
● Originated from the Google Brain team.
● Extremely powerful and flexible.
● Supports a variety of languages:
○ Python
○ C/C++
○ R
○ JavaScript
○ …
● Library of pre-built models and datasets.
● Supports distributed learning.
Why?
Not just for fun
● Our data is already in the database.
● We can easily use the power of SQL to choose and format data for analysis:
○ SQL is designed for working with datasets:
■ datum ~= scalar
■ tuple ~= vector
■ array/set ~= matrix/tensor
○ SELECT … FROM … WHERE …
○ Mathematical functions & operators: sqrt(), log(), power(), mod(), round()...
○ Aggregates and Window Functions, Common Table Expressions.
Analysis types
Regression analysis
● Model relationships between input values (features) and outputs.
● Analyse new or hypothetical inputs and predict outputs.
● For example, house prices:
○ Inputs:
■ Number of bedrooms
■ Property type (detached, semi, flat etc.)
■ Property condition
■ Proximity to the beach
■ Proximity to major roads or a rail link to the city
■ Council tax cost
■ Number of nearby pubs serving CAMRA recommended beer
○ Output:
■ The price of the house
Time series analysis
● Analyse time series data and make predictions.
● More powerful than linear analysis, predicting:
○ Linear trends (upwards or downwards)
○ Seasonal variability, e.g.
■ Summer is busier than winter.
■ Friday and Saturday night account for 60% of trade.
■ January is always the slowest month.
■ Multiple seasonalities can be predicted together.
○ Noise is inherently smoothed out, unless it overshadows trends and seasonal variations.
● Useful for multiple purposes:
○ Capacity management of application deployments.
○ Sales predictions.
○ Stock management.
Other types of analysis
Not covered in this talk!
● Text prediction/generation.
● Text classification.
● Image classification.
● Object detection.
● Audio analysis.
● Speech recognition.
● The list goes on!
Getting set up
Setting up pl/python3
● Install PostgreSQL:
○ If using EDB installers, use StackBuilder to install the LanguagePack.
○ On Linux, install the pl/python3 package, e.g. on Debian/Ubuntu: postgresql-plpython3-13.
● Run psql or pgAdmin, and execute:
○ CREATE EXTENSION plpython3u;
Setting up the Python environment
● Any Python libraries that will be used need to be added to the Python environment, using pip or the OS package manager:
○ On Linux, using the system Python:
■ sudo pip3 install <package 1> …
○ On macOS, using the EDB LanguagePack:
■ sudo /Library/edb/languagepack/v1/Python-3.7/bin/pip install <package 1> …
○ On Windows, using the EDB LanguagePack (as Administrator):
■ C:\edb\languagepack\v1\Python-3.7\bin\pip install <package 1> …
● Recommended starter packages:
○ tensorflow
○ numpy (will be installed automatically as a dependency of tensorflow)
○ pandas
○ matplotlib
○ seaborn
A brief introduction to pl/python3
A.K.A. Making sure it all works
Data preparation
Preparing the data
● Cleanup:
○ Goal: maximise the accuracy of the model.
○ Method: eliminate data that might skew results.
○ Requires: analysis and understanding of existing data.
○ Applies mostly to regression analysis where we're trying to model a relationship, rather than time series.
● Multiple data sets:
○ Training data is used to teach the model.
○ Validation data is used during training to validate what has been learnt.
○ Test data is optionally used to test the model.
○ Training vs. validation data is typically randomly selected for regression analysis.
○ Training vs. validation data is typically sequential for time series analysis.
○ Ratio of training to validation (and test) data is usually skewed towards training, e.g. 3:1 or 4:1.
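The two selection strategies above can be sketched in plain Python. This is a minimal illustration rather than the talk's code: the helper name split_rows, the 3:1 ratio and the fixed seed are assumptions chosen for the example.

```python
import random


def split_rows(rows, validation_pct=25, shuffle=True, seed=42):
    """Split rows into (training, validation) lists.

    shuffle=True  -> random selection, as is typical for regression analysis.
    shuffle=False -> sequential split, as is typical for time series analysis,
                     where validation data follows the training data in time.
    """
    rows = list(rows)
    if shuffle:
        # A seeded generator keeps the split reproducible between runs
        random.Random(seed).shuffle(rows)
    validation_rows = len(rows) * validation_pct // 100
    training_rows = len(rows) - validation_rows
    return rows[:training_rows], rows[training_rows:]


# Twenty hypothetical rows, split 3:1 (75% training, 25% validation)
rows = list(range(20))
ts_training, ts_validation = split_rows(rows, shuffle=False)
rg_training, rg_validation = split_rows(rows, shuffle=True)
```

With shuffle=False the validation rows are simply the last quarter of the series, so the model is always validated on data that comes after what it was trained on.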
Correlations
Analysis
● Some features have stronger correlations to the output than others.
● We can exclude uncorrelated or loosely correlated features to simplify the neural network (model) and increase accuracy.
NOTICE: Correlation data:
crim zn indus chas nox rm age dis rad tax ptratio b lstat medv
crim 1.000000 -0.200469 0.406583 -0.055892 0.420972 -0.219247 0.352734 -0.379670 0.625505 0.582764 0.289946 -0.385064 0.455621 -0.388305
zn -0.200469 1.000000 -0.533828 -0.042697 -0.516604 0.311991 -0.569537 0.664408 -0.311948 -0.314563 -0.391679 0.175520 -0.412995 0.360445
indus 0.406583 -0.533828 1.000000 0.062938 0.763651 -0.391676 0.644779 -0.708027 0.595129 0.720760 0.383248 -0.356977 0.603800 -0.483725
chas -0.055892 -0.042697 0.062938 1.000000 0.091203 0.091251 0.086518 -0.099176 -0.007368 -0.035587 -0.121515 0.048788 -0.053929 0.175260
nox 0.420972 -0.516604 0.763651 0.091203 1.000000 -0.302188 0.731470 -0.769230 0.611441 0.668023 0.188933 -0.380051 0.590879 -0.427321
rm -0.219247 0.311991 -0.391676 0.091251 -0.302188 1.000000 -0.240265 0.205246 -0.209847 -0.292048 -0.355501 0.128069 -0.613808 0.695360
age 0.352734 -0.569537 0.644779 0.086518 0.731470 -0.240265 1.000000 -0.747881 0.456022 0.506456 0.261515 -0.273534 0.602339 -0.376955
dis -0.379670 0.664408 -0.708027 -0.099176 -0.769230 0.205246 -0.747881 1.000000 -0.494588 -0.534432 -0.232471 0.291512 -0.496996 0.249929
rad 0.625505 -0.311948 0.595129 -0.007368 0.611441 -0.209847 0.456022 -0.494588 1.000000 0.910228 0.464741 -0.444413 0.488676 -0.381626
tax 0.582764 -0.314563 0.720760 -0.035587 0.668023 -0.292048 0.506456 -0.534432 0.910228 1.000000 0.460853 -0.441808 0.543993 -0.468536
ptratio 0.289946 -0.391679 0.383248 -0.121515 0.188933 -0.355501 0.261515 -0.232471 0.464741 0.460853 1.000000 -0.177383 0.374044 -0.507787
b -0.385064 0.175520 -0.356977 0.048788 -0.380051 0.128069 -0.273534 0.291512 -0.444413 -0.441808 -0.177383 1.000000 -0.366087 0.333461
lstat 0.455621 -0.412995 0.603800 -0.053929 0.590879 -0.613808 0.602339 -0.496996 0.488676 0.543993 0.374044 -0.366087 1.000000 -0.737663
medv -0.388305 0.360445 -0.483725 0.175260 -0.427321 0.695360 -0.376955 0.249929 -0.381626 -0.468536 -0.507787 0.333461 -0.737663 1.000000
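The table above has the shape of a Pearson correlation matrix. As a rough sketch of how such values are computed and used to pick features to drop, the following plain-Python example may help. The toy data, the function names and the 0.3 threshold are all assumptions for illustration; the talk does not state a cut-off.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return covariance / (spread_x * spread_y)


def weakly_correlated(columns, output, threshold=0.3):
    """Names of features whose |correlation| with the output is below threshold."""
    return [name for name, values in columns.items()
            if name != output
            and abs(pearson(values, columns[output])) < threshold]


# Hypothetical toy data: 'rooms' tracks 'price' closely, 'noise' does not
columns = {
    'rooms': [1, 2, 3, 4, 5, 6],
    'noise': [3, 1, 1, 3, 3, 1],
    'price': [10, 21, 29, 41, 50, 62],
}
to_drop = weakly_correlated(columns, 'price')
```

Here to_drop contains only 'noise'. In the table above, the equivalent check would read down the medv column, since medv is the output.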
Eliminating outliers
Analysis
● Outlier values in the training/validation data can make it harder to build an accurate model.
● Analyse the input features and automatically remove rows with outliers using an algorithm such as the interquartile range (IQR) rule, i.e. flag values that lie more than 1.5 × IQR below the first quartile or above the third quartile:
NOTICE: Outliers detected using IQR:
row crim zn indus chas nox rm age dis rad tax ptratio b lstat medv
0 False False False False False False False False False False False False False False
1 False False False False False False False False False False False False False False
2 False False False False False False False False False False False False False False
3 False False False False False False False False False False False False False False
...
18 False False False False False False False False False False False True False False
19 False False False False False False False False False False False False False False
...
Eliminating outliers
Example code
# Outlier detection
# Note: 'data' is a Pandas dataframe containing our raw data
Q1 = data.quantile(0.25)
Q3 = data.quantile(0.75)
IQR = Q3 - Q1
# True where a value lies more than 1.5 * IQR outside the Q1..Q3 range
outliers = (data < (Q1 - 1.5 * IQR)) | (data > (Q3 + 1.5 * IQR))
plpy.notice('Outliers detected using IQR:\n{}\n'.format(outliers))
# Outlier removal: drop every row that has an outlier in any column
plpy.notice('Removing outliers...')
data = data[~outliers.any(axis=1)]
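To make the arithmetic behind the IQR rule concrete, here is a small worked example on a single column using only the standard library. The sample values are hypothetical; method='inclusive' is chosen because it interpolates linearly between data points, which is also the default behaviour of the Pandas quantile() call.

```python
from statistics import quantiles

# Hypothetical single-column sample; 100 is an obvious outlier
values = [1, 2, 3, 4, 5, 6, 7, 8, 100]

# Quartiles with linear interpolation between data points
q1, _median, q3 = quantiles(values, n=4, method='inclusive')
iqr = q3 - q1
lower = q1 - 1.5 * iqr
upper = q3 + 1.5 * iqr

# Values outside [lower, upper] are treated as outliers
outliers = [v for v in values if v < lower or v > upper]
kept = [v for v in values if lower <= v <= upper]
```

For this sample Q1 is 3 and Q3 is 7, so the IQR is 4 and the accepted range is -3 to 13; only the value 100 is removed.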
Visualisation
Everyone likes a pretty picture
Creating data sets
Example code
# Figure out how many rows to use for training, validation and test
test_rows = int((actual_rows/100) * test_pct)
validation_rows = int((actual_rows/100) * validation_pct)
training_rows = actual_rows - test_rows - validation_rows
# Split the data into input and output dataframes (the last column is the output)
input = data[columns[:-1]]
output = data[columns[-1:]]
# Split the input and output into training, validation and test sets
training_input = input[:training_rows]
training_output = output[:training_rows]
validation_input = input[training_rows:training_rows+validation_rows]
validation_output = output[training_rows:training_rows+validation_rows]
test_input = input[training_rows+validation_rows:]
test_output = output[training_rows+validation_rows:]
Building
Designing a model
● A model is an interconnected layered network of known mathematical functions with trainable parameters (or filters); a.k.a. a neural network.
● Different model architectures are suited to different types of task:
○ Regression might use a simple network with multiple layers:
■ The number of input filters matches the number of input features.
■ Inner layers can be constructed as desired for best results; often based on trial and error and experience.
■ The number of output filters matches the number of outputs.
■ Layers are dense; an activation function allows modelling of non-linear functions.
○ The WaveNet architecture is well suited to time series analysis, despite being originally designed for audio analysis:
■ A single filter on the input layer.
■ Multiple layers of filters with increasing dilation to detect seasonal patterns, e.g. 2, 4, 8, 16, 32.
■ A single filter on the output layer.
■ Layers are convolutional; all filters in one layer connect to all filters in the next.
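To see why increasing dilation lets a WaveNet-style stack pick up long seasonal patterns, it helps to count how many past time steps each output can see (the receptive field). A minimal sketch, assuming causal convolutions with a kernel size of 2 and dilation rates that double from 1 to 32; the function name is hypothetical.

```python
def receptive_field(kernel_size, dilation_rates):
    """Number of input time steps that influence one output step of a
    stack of causal 1D convolutions with the given dilation rates."""
    field = 1
    for rate in dilation_rates:
        # Each layer reaches (kernel_size - 1) * rate steps further back
        field += (kernel_size - 1) * rate
    return field


doubling = receptive_field(2, (1, 2, 4, 8, 16, 32))
undilated = receptive_field(2, (1, 1, 1, 1, 1, 1))
```

Six layers with doubling dilation see 64 time steps, whereas six undilated layers see only 7, so a pattern that repeats every 30 samples or so is only visible to the dilated stack.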
Creating the model
Regression analysis
import tensorflow as tf

# Define the model
# 2 layers of 13 filters for the input features, and one layer of one filter for the output
l1 = tf.keras.layers.Dense(units=13, input_shape=(2,), activation='relu')
l2 = tf.keras.layers.Dense(units=13, activation='relu')
l3 = tf.keras.layers.Dense(units=1)
model = tf.keras.Sequential([l1, l2, l3])
# Compile it
model.compile(loss=tf.keras.losses.MeanSquaredError(),
              optimizer='adam')
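As a rough guide to how many trainable parameters a network like this has, each dense layer holds one weight per input for every unit, plus one bias per unit. The sketch below counts them for the layer sizes as written in the code above (2 inputs, then 13, 13 and 1 units); the helper name is an assumption for illustration.

```python
def dense_parameters(inputs, units):
    """Trainable parameters in a dense layer: one weight per input for
    each unit, plus one bias per unit."""
    return inputs * units + units


# Layer sizes as written in the code above: 2 inputs -> 13 -> 13 -> 1
layer_sizes = [2, 13, 13, 1]
per_layer = [dense_parameters(i, u)
             for i, u in zip(layer_sizes, layer_sizes[1:])]
total = sum(per_layer)
```

This gives 39, 182 and 14 parameters per layer, 235 in total. Calling model.summary() on the Keras model should report the same breakdown.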
Creating the model
Time series analysis
from tensorflow import keras

# Define the model
model = keras.models.Sequential()
# Input layer
model.add(keras.layers.InputLayer(input_shape=[None, 1]))
# Add multiple 1D convolutional layers with increasing dilation rates to
# allow each layer to detect patterns over longer time spans
for dilation_rate in (1, 2, 4, 8, 16, 32):
    model.add(keras.layers.Conv1D(filters=32, kernel_size=2, strides=1,
                                  dilation_rate=dilation_rate, padding="causal", activation="relu"))
# Add one output layer, with 1 filter to give us one output per time step
model.add(keras.layers.Conv1D(filters=1, kernel_size=1))
# Create a learning optimiser and compile the model
optimizer = keras.optimizers.Adam(learning_rate=3e-4)
model.compile(loss=keras.losses.Huber(), optimizer=optimizer, metrics=["mae"])
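The time series model above is compiled with Huber loss rather than mean squared error. The following plain-Python sketch compares the two for a single prediction error, to show why Huber loss reacts less strongly to outliers. It assumes a delta of 1.0, which is the default for keras.losses.Huber; the function names are hypothetical.

```python
def squared_error(error):
    """Squared error for one prediction, as averaged by MSE."""
    return error ** 2


def huber(error, delta=1.0):
    """Huber loss for one prediction error: quadratic for small errors,
    linear for large ones."""
    if abs(error) <= delta:
        return 0.5 * error ** 2
    return delta * (abs(error) - 0.5 * delta)


small, large = 0.5, 10.0
small_losses = (squared_error(small), huber(small))
large_losses = (squared_error(large), huber(large))
```

For the small error the two losses are of similar size (0.25 and 0.125), but for the large error the squared error is 100.0 while the Huber loss is only 9.5, so a single spike in the data pulls the model around far less.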
Training
Training the model
● Training is repeated multiple times (or epochs), hopefully improving each time:
○ The training data set is used for learning.
○ The validation data set is used to validate results during training.
○ The test data is optionally used to test the model after training.
● We monitor a metric to assess how well the network is learning:
○ For regression, I've had success with Mean Squared Error (which I monitor as Root Mean Squared Error).
○ For time series, Huber loss works well (it's less sensitive to outliers than MSE).
● A callback is used to checkpoint (save) the model each time we see better accuracy than in any previous epoch.
● With regression analysis, we use an 'early stopping' callback to exit the training epoch loop when no further significant improvement is made, to prevent the network learning the training data rather than the mathematical relationship.
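The checkpoint and early stopping behaviour described above can be sketched without TensorFlow. This is a simplified simulation over a list of per-epoch validation losses, not how the Keras callbacks are implemented (the real callbacks have more options); the function name and loss values are hypothetical.

```python
def train_with_early_stopping(validation_losses, patience):
    """Simulate the checkpoint and early-stopping logic over a sequence
    of per-epoch validation losses.

    Returns (best_epoch, best_loss, stopped_epoch).
    """
    best_epoch, best_loss = None, float('inf')
    epochs_without_improvement = 0
    stopped_epoch = None
    for epoch, loss in enumerate(validation_losses):
        if loss < best_loss:
            # Checkpoint: remember the best model seen so far
            best_epoch, best_loss = epoch, loss
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                # Early stopping: no improvement for `patience` epochs
                stopped_epoch = epoch
                break
    return best_epoch, best_loss, stopped_epoch


# Hypothetical losses: improvement stalls after epoch 2
losses = [0.9, 0.5, 0.4, 0.45, 0.41, 0.43, 0.3]
result = train_with_early_stopping(losses, patience=3)
```

Training stops at epoch 5 and the checkpoint from epoch 2 is the one to load afterwards. Note that the better loss at epoch 6 is never reached, which is the trade-off a small patience value makes.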
Training the model
Regression analysis
from math import sqrt
from tensorflow.keras.callbacks import EarlyStopping, LambdaCallback, ModelCheckpoint

# Save a checkpoint each time our loss metric improves.
checkpoint = ModelCheckpoint("checkpoint.h5", save_best_only=True)
# Use early stopping
early_stopping = EarlyStopping(patience=50)
# Display output. This would go to stdout automatically if we weren't using pl/python
# (max_z is defined elsewhere in the function and is not shown on this slide)
logger = LambdaCallback(
    on_epoch_end=lambda epoch, logs: plpy.notice(
        'epoch: {}, training RMSE: {} ({}%), validation RMSE: {} ({}%)'.format(
            epoch,
            sqrt(logs['loss']), round(100 / max_z * sqrt(logs['loss']), 5),
            sqrt(logs['val_loss']), round(100 / max_z * sqrt(logs['val_loss']), 5))))
# Train it!
history = model.fit(training_input, training_output,
                    validation_data=(validation_input, validation_output),
                    epochs=epochs, verbose=False, batch_size=50,
                    callbacks=[logger, checkpoint, early_stopping])
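The logger above turns the MSE loss into RMSE and then scales it by max_z to give a percentage. A small sketch of that arithmetic follows. It assumes that max_z is the largest output value in the data, which the slides do not state; the function name and the numbers are hypothetical.

```python
from math import sqrt


def rmse_report(mse_loss, max_output):
    """Convert an MSE loss into RMSE, plus RMSE as a percentage of the
    largest output value."""
    rmse = sqrt(mse_loss)
    return rmse, round(100 / max_output * rmse, 5)


# Hypothetical values: an MSE of 4.0 on outputs that peak at 50.0
rmse, pct = rmse_report(4.0, 50.0)
```

An MSE of 4.0 corresponds to an RMSE of 2.0, which is 4% of a maximum output of 50.0. Reporting the percentage makes the error easier to judge than the raw loss.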
Training the model
Time series analysis
# Save checkpoints when we get the best model
model_checkpoint = keras.callbacks.ModelCheckpoint("checkpoint.h5", save_best_only=True)
# Use early stopping to prevent overfitting
early_stopping = keras.callbacks.EarlyStopping(patience=50)
# Display output. This would go to stdout automatically if we weren't using pl/python
# The loss here is Huber loss (not MSE), so report it directly alongside the 'mae' metric
logger = keras.callbacks.LambdaCallback(
    on_epoch_end=lambda epoch, logs: plpy.notice(
        'epoch: {}, training loss: {}, MAE: {}, validation loss: {}, MAE: {}'.format(
            epoch,
            logs['loss'], logs['mae'],
            logs['val_loss'], logs['val_mae'])))
# Train it!
history = model.fit(train_set, epochs=100,
                    validation_data=valid_set,
                    callbacks=[early_stopping, logger, model_checkpoint])
Use once vs. use many
● Each model is trained with a specific data set.
● With regression analysis, we can re-use a model with any input features to predict an output:
○ In practice this means we might use the model repeatedly over time to model different inputs.
● With time series analysis we can reuse the model to predict different timeframes:
○ In practice, this means we might only use a model once when performing time series analysis.
● Models can be 're-trained' as new data becomes available:
○ If the data distribution has changed, the model might degrade.
○ It may be preferable to re-train from scratch.
● For complex problems, it may be useful to start with a suitable pre-trained generic model, and continue training with specific data:
○ This is known as transfer learning.
Using
Using the model
Regression analysis
CREATE OR REPLACE FUNCTION public.rg_analysis(
    input_values double precision[],
    model_path text)
    RETURNS double precision[]
    LANGUAGE 'plpython3u'
AS $BODY$
import tensorflow as tf
# Reset everything
tf.keras.backend.clear_session()
tf.random.set_seed(42)
# Load the model from the path passed in by the caller
model = tf.keras.models.load_model(model_path)
# Are we dealing with a single prediction,
# or a list of them?
if not any(isinstance(sub, list) for sub in input_values):
    data = [input_values]
else:
    data = input_values
# Make the prediction(s)
result = model.predict([data])[0]
result = [item for elem in result for item in elem]
return result
$BODY$;
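The input handling in the function above relies on pl/python3 passing a one-dimensional array as a flat list and a two-dimensional array as a list of lists. The sketch below isolates that logic, together with the flattening of the result, so that it can be tried outside the database. The helper names and sample numbers are hypothetical.

```python
def as_batch(input_values):
    """Wrap a single list of features in an outer list so that one
    prediction and many predictions share the same 2D shape."""
    if not any(isinstance(sub, list) for sub in input_values):
        return [input_values]
    return input_values


def flatten(rows):
    """Flatten a list of per-row outputs into a single list."""
    return [item for row in rows for item in row]


single = as_batch([6.5, 4.0])
many = as_batch([[6.5, 4.0], [7.1, 2.5]])
flat = flatten([[24.1], [30.2]])
```

Both calls to as_batch() return a list of rows, so the prediction code only ever has to deal with one shape, and flatten() turns the one-value-per-row output back into a simple array for the SQL caller.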
Using the model
Time series analysis
# Load the best model from the last checkpoint
# (model_forecast() and plot_series() are helper functions defined elsewhere, not shown here)
model = keras.models.load_model("checkpoint.h5")
cnn_forecast = model_forecast(model,
                              series[..., np.newaxis],
                              window_size)
cnn_forecast = cnn_forecast[train_samples - window_size:-1, -1, 0]
plt.figure(figsize=(10, 6))
plot_series(dates,
            np.concatenate([series[:train_samples],
                            np.full(valid_samples, None, dtype=float)]),
            label="Training Data")
plot_series(dates,
            np.concatenate([np.full(train_samples, None, dtype=float),
                            series[train_samples:]]),
            label="Validation Data")
plot_series(dates,
            np.concatenate([np.full(train_samples, None, dtype=float),
                            cnn_forecast]),
            label="Forecast Data")
plt.savefig('ts_analysis.png')
Conclusion
Summary
In this talk:
● We introduced PostgreSQL, TensorFlow and pl/python3.
● Discussed why we might use them together.
● Introduced two (of many) types of analysis we can perform:
○ Regression.
○ Time Series.
● Showed how we can call TensorFlow from PostgreSQL using pl/python3.
● Walked through the main steps of performing an analysis, considering regression and time series problems:
○ Preparing the data.
○ Creating a model.
○ Training the model.
○ Using the model.
Questions and resources
Questions?
● EDB blog, includes posts on machine learning and other topics:
○ https://www.enterprisedb.com/dave-page
● Experimental code from my ML/AI journey:
○ https://github.com/dpage/ml-experiments
● Other resources:
○ https://www.postgresql.org
○ https://www.tensorflow.org
○ https://www.postgresql.org/docs/current/plpython.html
○ https://pandas.pydata.org
○ https://numpy.org
○ https://matplotlib.org
○ https://seaborn.pydata.org

Lecture 1 Pandas Basics.pptx machine learning
my6305874
 
Machine learning 101
Machine learning 101Machine learning 101
Machine learning 101
AmmarChalifah
 
Vectorized UDF: Scalable Analysis with Python and PySpark with Li Jin
Vectorized UDF: Scalable Analysis with Python and PySpark with Li JinVectorized UDF: Scalable Analysis with Python and PySpark with Li Jin
Vectorized UDF: Scalable Analysis with Python and PySpark with Li Jin
Databricks
 
Pandas UDF: Scalable Analysis with Python and PySpark
Pandas UDF: Scalable Analysis with Python and PySparkPandas UDF: Scalable Analysis with Python and PySpark
Pandas UDF: Scalable Analysis with Python and PySpark
Li Jin
 
An Overview of Python for Data Analytics
An Overview of Python for Data AnalyticsAn Overview of Python for Data Analytics
An Overview of Python for Data Analytics
IRJET Journal
 
Ad

More from EDB (20)

Cloud Migration Paths: Kubernetes, IaaS, or DBaaS
Cloud Migration Paths: Kubernetes, IaaS, or DBaaSCloud Migration Paths: Kubernetes, IaaS, or DBaaS
Cloud Migration Paths: Kubernetes, IaaS, or DBaaS
EDB
 
Die 10 besten PostgreSQL-Replikationsstrategien für Ihr Unternehmen
Die 10 besten PostgreSQL-Replikationsstrategien für Ihr UnternehmenDie 10 besten PostgreSQL-Replikationsstrategien für Ihr Unternehmen
Die 10 besten PostgreSQL-Replikationsstrategien für Ihr Unternehmen
EDB
 
Migre sus bases de datos Oracle a la nube
Migre sus bases de datos Oracle a la nube Migre sus bases de datos Oracle a la nube
Migre sus bases de datos Oracle a la nube
EDB
 
EFM Office Hours - APJ - July 29, 2021
EFM Office Hours - APJ - July 29, 2021EFM Office Hours - APJ - July 29, 2021
EFM Office Hours - APJ - July 29, 2021
EDB
 
Benchmarking Cloud Native PostgreSQL
Benchmarking Cloud Native PostgreSQLBenchmarking Cloud Native PostgreSQL
Benchmarking Cloud Native PostgreSQL
EDB
 
Las Variaciones de la Replicación de PostgreSQL
Las Variaciones de la Replicación de PostgreSQLLas Variaciones de la Replicación de PostgreSQL
Las Variaciones de la Replicación de PostgreSQL
EDB
 
NoSQL and Spatial Database Capabilities using PostgreSQL
NoSQL and Spatial Database Capabilities using PostgreSQLNoSQL and Spatial Database Capabilities using PostgreSQL
NoSQL and Spatial Database Capabilities using PostgreSQL
EDB
 
Is There Anything PgBouncer Can’t Do?
Is There Anything PgBouncer Can’t Do?Is There Anything PgBouncer Can’t Do?
Is There Anything PgBouncer Can’t Do?
EDB
 
Practical Partitioning in Production with Postgres
Practical Partitioning in Production with PostgresPractical Partitioning in Production with Postgres
Practical Partitioning in Production with Postgres
EDB
 
A Deeper Dive into EXPLAIN
A Deeper Dive into EXPLAINA Deeper Dive into EXPLAIN
A Deeper Dive into EXPLAIN
EDB
 
IOT with PostgreSQL
IOT with PostgreSQLIOT with PostgreSQL
IOT with PostgreSQL
EDB
 
A Journey from Oracle to PostgreSQL
A Journey from Oracle to PostgreSQLA Journey from Oracle to PostgreSQL
A Journey from Oracle to PostgreSQL
EDB
 
Psql is awesome!
Psql is awesome!Psql is awesome!
Psql is awesome!
EDB
 
EDB 13 - New Enhancements for Security and Usability - APJ
EDB 13 - New Enhancements for Security and Usability - APJEDB 13 - New Enhancements for Security and Usability - APJ
EDB 13 - New Enhancements for Security and Usability - APJ
EDB
 
Comment sauvegarder correctement vos données
Comment sauvegarder correctement vos donnéesComment sauvegarder correctement vos données
Comment sauvegarder correctement vos données
EDB
 
Cloud Native PostgreSQL - Italiano
Cloud Native PostgreSQL - ItalianoCloud Native PostgreSQL - Italiano
Cloud Native PostgreSQL - Italiano
EDB
 
New enhancements for security and usability in EDB 13
New enhancements for security and usability in EDB 13New enhancements for security and usability in EDB 13
New enhancements for security and usability in EDB 13
EDB
 
Best Practices in Security with PostgreSQL
Best Practices in Security with PostgreSQLBest Practices in Security with PostgreSQL
Best Practices in Security with PostgreSQL
EDB
 
Cloud Native PostgreSQL - APJ
Cloud Native PostgreSQL - APJCloud Native PostgreSQL - APJ
Cloud Native PostgreSQL - APJ
EDB
 
Best Practices in Security with PostgreSQL
Best Practices in Security with PostgreSQLBest Practices in Security with PostgreSQL
Best Practices in Security with PostgreSQL
EDB
 
Cloud Migration Paths: Kubernetes, IaaS, or DBaaS
Cloud Migration Paths: Kubernetes, IaaS, or DBaaSCloud Migration Paths: Kubernetes, IaaS, or DBaaS
Cloud Migration Paths: Kubernetes, IaaS, or DBaaS
EDB
 
Die 10 besten PostgreSQL-Replikationsstrategien für Ihr Unternehmen
Die 10 besten PostgreSQL-Replikationsstrategien für Ihr UnternehmenDie 10 besten PostgreSQL-Replikationsstrategien für Ihr Unternehmen
Die 10 besten PostgreSQL-Replikationsstrategien für Ihr Unternehmen
EDB
 
Migre sus bases de datos Oracle a la nube
Migre sus bases de datos Oracle a la nube Migre sus bases de datos Oracle a la nube
Migre sus bases de datos Oracle a la nube
EDB
 
EFM Office Hours - APJ - July 29, 2021
EFM Office Hours - APJ - July 29, 2021EFM Office Hours - APJ - July 29, 2021
EFM Office Hours - APJ - July 29, 2021
EDB
 
Benchmarking Cloud Native PostgreSQL
Benchmarking Cloud Native PostgreSQLBenchmarking Cloud Native PostgreSQL
Benchmarking Cloud Native PostgreSQL
EDB
 
Las Variaciones de la Replicación de PostgreSQL
Las Variaciones de la Replicación de PostgreSQLLas Variaciones de la Replicación de PostgreSQL
Las Variaciones de la Replicación de PostgreSQL
EDB
 
NoSQL and Spatial Database Capabilities using PostgreSQL
NoSQL and Spatial Database Capabilities using PostgreSQLNoSQL and Spatial Database Capabilities using PostgreSQL
NoSQL and Spatial Database Capabilities using PostgreSQL
EDB
 
Is There Anything PgBouncer Can’t Do?
Is There Anything PgBouncer Can’t Do?Is There Anything PgBouncer Can’t Do?
Is There Anything PgBouncer Can’t Do?
EDB
 
Practical Partitioning in Production with Postgres
Practical Partitioning in Production with PostgresPractical Partitioning in Production with Postgres
Practical Partitioning in Production with Postgres
EDB
 
A Deeper Dive into EXPLAIN
A Deeper Dive into EXPLAINA Deeper Dive into EXPLAIN
A Deeper Dive into EXPLAIN
EDB
 
IOT with PostgreSQL
IOT with PostgreSQLIOT with PostgreSQL
IOT with PostgreSQL
EDB
 
A Journey from Oracle to PostgreSQL
A Journey from Oracle to PostgreSQLA Journey from Oracle to PostgreSQL
A Journey from Oracle to PostgreSQL
EDB
 
Psql is awesome!
Psql is awesome!Psql is awesome!
Psql is awesome!
EDB
 
EDB 13 - New Enhancements for Security and Usability - APJ
EDB 13 - New Enhancements for Security and Usability - APJEDB 13 - New Enhancements for Security and Usability - APJ
EDB 13 - New Enhancements for Security and Usability - APJ
EDB
 
Comment sauvegarder correctement vos données
Comment sauvegarder correctement vos donnéesComment sauvegarder correctement vos données
Comment sauvegarder correctement vos données
EDB
 
Cloud Native PostgreSQL - Italiano
Cloud Native PostgreSQL - ItalianoCloud Native PostgreSQL - Italiano
Cloud Native PostgreSQL - Italiano
EDB
 
New enhancements for security and usability in EDB 13
New enhancements for security and usability in EDB 13New enhancements for security and usability in EDB 13
New enhancements for security and usability in EDB 13
EDB
 
Best Practices in Security with PostgreSQL
Best Practices in Security with PostgreSQLBest Practices in Security with PostgreSQL
Best Practices in Security with PostgreSQL
EDB
 
Cloud Native PostgreSQL - APJ
Cloud Native PostgreSQL - APJCloud Native PostgreSQL - APJ
Cloud Native PostgreSQL - APJ
EDB
 
Best Practices in Security with PostgreSQL
Best Practices in Security with PostgreSQLBest Practices in Security with PostgreSQL
Best Practices in Security with PostgreSQL
EDB
 
Ad

Recently uploaded (20)

“State-space Models vs. Transformers for Ultra-low-power Edge AI,” a Presenta...
“State-space Models vs. Transformers for Ultra-low-power Edge AI,” a Presenta...“State-space Models vs. Transformers for Ultra-low-power Edge AI,” a Presenta...
“State-space Models vs. Transformers for Ultra-low-power Edge AI,” a Presenta...
Edge AI and Vision Alliance
 
Can We Use Rust to Develop Extensions for PostgreSQL? (POSETTE: An Event for ...
Can We Use Rust to Develop Extensions for PostgreSQL? (POSETTE: An Event for ...Can We Use Rust to Develop Extensions for PostgreSQL? (POSETTE: An Event for ...
Can We Use Rust to Develop Extensions for PostgreSQL? (POSETTE: An Event for ...
NTT DATA Technology & Innovation
 
Boosting MySQL with Vector Search -THE VECTOR SEARCH CONFERENCE 2025 .pdf
Boosting MySQL with Vector Search -THE VECTOR SEARCH CONFERENCE 2025 .pdfBoosting MySQL with Vector Search -THE VECTOR SEARCH CONFERENCE 2025 .pdf
Boosting MySQL with Vector Search -THE VECTOR SEARCH CONFERENCE 2025 .pdf
Alkin Tezuysal
 
Integration of Utility Data into 3D BIM Models Using a 3D Solids Modeling Wor...
Integration of Utility Data into 3D BIM Models Using a 3D Solids Modeling Wor...Integration of Utility Data into 3D BIM Models Using a 3D Solids Modeling Wor...
Integration of Utility Data into 3D BIM Models Using a 3D Solids Modeling Wor...
Safe Software
 
Agentic AI: Beyond the Buzz- LangGraph Studio V2
Agentic AI: Beyond the Buzz- LangGraph Studio V2Agentic AI: Beyond the Buzz- LangGraph Studio V2
Agentic AI: Beyond the Buzz- LangGraph Studio V2
Shashikant Jagtap
 
TrustArc Webinar - 2025 Global Privacy Survey
TrustArc Webinar - 2025 Global Privacy SurveyTrustArc Webinar - 2025 Global Privacy Survey
TrustArc Webinar - 2025 Global Privacy Survey
TrustArc
 
Artificial Intelligence in the Nonprofit Boardroom.pdf
Artificial Intelligence in the Nonprofit Boardroom.pdfArtificial Intelligence in the Nonprofit Boardroom.pdf
Artificial Intelligence in the Nonprofit Boardroom.pdf
OnBoard
 
AI Agents in Logistics and Supply Chain Applications Benefits and Implementation
AI Agents in Logistics and Supply Chain Applications Benefits and ImplementationAI Agents in Logistics and Supply Chain Applications Benefits and Implementation
AI Agents in Logistics and Supply Chain Applications Benefits and Implementation
Christine Shepherd
 
If You Use Databricks, You Definitely Need FME
If You Use Databricks, You Definitely Need FMEIf You Use Databricks, You Definitely Need FME
If You Use Databricks, You Definitely Need FME
Safe Software
 
Azure vs AWS Which Cloud Platform Is Best for Your Business in 2025
Azure vs AWS  Which Cloud Platform Is Best for Your Business in 2025Azure vs AWS  Which Cloud Platform Is Best for Your Business in 2025
Azure vs AWS Which Cloud Platform Is Best for Your Business in 2025
Infrassist Technologies Pvt. Ltd.
 
Enabling BIM / GIS integrations with Other Systems with FME
Enabling BIM / GIS integrations with Other Systems with FMEEnabling BIM / GIS integrations with Other Systems with FME
Enabling BIM / GIS integrations with Other Systems with FME
Safe Software
 
Edge-banding-machines-edgeteq-s-200-en-.pdf
Edge-banding-machines-edgeteq-s-200-en-.pdfEdge-banding-machines-edgeteq-s-200-en-.pdf
Edge-banding-machines-edgeteq-s-200-en-.pdf
AmirStern2
 
Oracle Cloud Infrastructure Generative AI Professional
Oracle Cloud Infrastructure Generative AI ProfessionalOracle Cloud Infrastructure Generative AI Professional
Oracle Cloud Infrastructure Generative AI Professional
VICTOR MAESTRE RAMIREZ
 
cnc-drilling-dowel-inserting-machine-drillteq-d-510-english.pdf
cnc-drilling-dowel-inserting-machine-drillteq-d-510-english.pdfcnc-drilling-dowel-inserting-machine-drillteq-d-510-english.pdf
cnc-drilling-dowel-inserting-machine-drillteq-d-510-english.pdf
AmirStern2
 
Secure Access with Azure Active Directory
Secure Access with Azure Active DirectorySecure Access with Azure Active Directory
Secure Access with Azure Active Directory
VICTOR MAESTRE RAMIREZ
 
Domino IQ – What to Expect, First Steps and Use Cases
Domino IQ – What to Expect, First Steps and Use CasesDomino IQ – What to Expect, First Steps and Use Cases
Domino IQ – What to Expect, First Steps and Use Cases
panagenda
 
Trends Artificial Intelligence - Mary Meeker
Trends Artificial Intelligence - Mary MeekerTrends Artificial Intelligence - Mary Meeker
Trends Artificial Intelligence - Mary Meeker
Clive Dickens
 
vertical-cnc-processing-centers-drillteq-v-200-en.pdf
vertical-cnc-processing-centers-drillteq-v-200-en.pdfvertical-cnc-processing-centers-drillteq-v-200-en.pdf
vertical-cnc-processing-centers-drillteq-v-200-en.pdf
AmirStern2
 
Introduction to Internet of things .ppt.
Introduction to Internet of things .ppt.Introduction to Internet of things .ppt.
Introduction to Internet of things .ppt.
hok12341073
 
Precisely Demo Showcase: Powering ServiceNow Discovery with Precisely Ironstr...
Precisely Demo Showcase: Powering ServiceNow Discovery with Precisely Ironstr...Precisely Demo Showcase: Powering ServiceNow Discovery with Precisely Ironstr...
Precisely Demo Showcase: Powering ServiceNow Discovery with Precisely Ironstr...
Precisely
 
“State-space Models vs. Transformers for Ultra-low-power Edge AI,” a Presenta...
“State-space Models vs. Transformers for Ultra-low-power Edge AI,” a Presenta...“State-space Models vs. Transformers for Ultra-low-power Edge AI,” a Presenta...
“State-space Models vs. Transformers for Ultra-low-power Edge AI,” a Presenta...
Edge AI and Vision Alliance
 
Can We Use Rust to Develop Extensions for PostgreSQL? (POSETTE: An Event for ...
Can We Use Rust to Develop Extensions for PostgreSQL? (POSETTE: An Event for ...Can We Use Rust to Develop Extensions for PostgreSQL? (POSETTE: An Event for ...
Can We Use Rust to Develop Extensions for PostgreSQL? (POSETTE: An Event for ...
NTT DATA Technology & Innovation
 
Boosting MySQL with Vector Search -THE VECTOR SEARCH CONFERENCE 2025 .pdf
Boosting MySQL with Vector Search -THE VECTOR SEARCH CONFERENCE 2025 .pdfBoosting MySQL with Vector Search -THE VECTOR SEARCH CONFERENCE 2025 .pdf
Boosting MySQL with Vector Search -THE VECTOR SEARCH CONFERENCE 2025 .pdf
Alkin Tezuysal
 
Integration of Utility Data into 3D BIM Models Using a 3D Solids Modeling Wor...
Integration of Utility Data into 3D BIM Models Using a 3D Solids Modeling Wor...Integration of Utility Data into 3D BIM Models Using a 3D Solids Modeling Wor...
Integration of Utility Data into 3D BIM Models Using a 3D Solids Modeling Wor...
Safe Software
 
Agentic AI: Beyond the Buzz- LangGraph Studio V2
Agentic AI: Beyond the Buzz- LangGraph Studio V2Agentic AI: Beyond the Buzz- LangGraph Studio V2
Agentic AI: Beyond the Buzz- LangGraph Studio V2
Shashikant Jagtap
 
TrustArc Webinar - 2025 Global Privacy Survey
TrustArc Webinar - 2025 Global Privacy SurveyTrustArc Webinar - 2025 Global Privacy Survey
TrustArc Webinar - 2025 Global Privacy Survey
TrustArc
 
Artificial Intelligence in the Nonprofit Boardroom.pdf
Artificial Intelligence in the Nonprofit Boardroom.pdfArtificial Intelligence in the Nonprofit Boardroom.pdf
Artificial Intelligence in the Nonprofit Boardroom.pdf
OnBoard
 
AI Agents in Logistics and Supply Chain Applications Benefits and Implementation
AI Agents in Logistics and Supply Chain Applications Benefits and ImplementationAI Agents in Logistics and Supply Chain Applications Benefits and Implementation
AI Agents in Logistics and Supply Chain Applications Benefits and Implementation
Christine Shepherd
 
If You Use Databricks, You Definitely Need FME
If You Use Databricks, You Definitely Need FMEIf You Use Databricks, You Definitely Need FME
If You Use Databricks, You Definitely Need FME
Safe Software
 
Azure vs AWS Which Cloud Platform Is Best for Your Business in 2025
Azure vs AWS  Which Cloud Platform Is Best for Your Business in 2025Azure vs AWS  Which Cloud Platform Is Best for Your Business in 2025
Azure vs AWS Which Cloud Platform Is Best for Your Business in 2025
Infrassist Technologies Pvt. Ltd.
 
Enabling BIM / GIS integrations with Other Systems with FME
Enabling BIM / GIS integrations with Other Systems with FMEEnabling BIM / GIS integrations with Other Systems with FME
Enabling BIM / GIS integrations with Other Systems with FME
Safe Software
 
Edge-banding-machines-edgeteq-s-200-en-.pdf
Edge-banding-machines-edgeteq-s-200-en-.pdfEdge-banding-machines-edgeteq-s-200-en-.pdf
Edge-banding-machines-edgeteq-s-200-en-.pdf
AmirStern2
 
Oracle Cloud Infrastructure Generative AI Professional
Oracle Cloud Infrastructure Generative AI ProfessionalOracle Cloud Infrastructure Generative AI Professional
Oracle Cloud Infrastructure Generative AI Professional
VICTOR MAESTRE RAMIREZ
 
cnc-drilling-dowel-inserting-machine-drillteq-d-510-english.pdf
cnc-drilling-dowel-inserting-machine-drillteq-d-510-english.pdfcnc-drilling-dowel-inserting-machine-drillteq-d-510-english.pdf
cnc-drilling-dowel-inserting-machine-drillteq-d-510-english.pdf
AmirStern2
 
Secure Access with Azure Active Directory
Secure Access with Azure Active DirectorySecure Access with Azure Active Directory
Secure Access with Azure Active Directory
VICTOR MAESTRE RAMIREZ
 
Domino IQ – What to Expect, First Steps and Use Cases
Domino IQ – What to Expect, First Steps and Use CasesDomino IQ – What to Expect, First Steps and Use Cases
Domino IQ – What to Expect, First Steps and Use Cases
panagenda
 
Trends Artificial Intelligence - Mary Meeker
Trends Artificial Intelligence - Mary MeekerTrends Artificial Intelligence - Mary Meeker
Trends Artificial Intelligence - Mary Meeker
Clive Dickens
 
vertical-cnc-processing-centers-drillteq-v-200-en.pdf
vertical-cnc-processing-centers-drillteq-v-200-en.pdfvertical-cnc-processing-centers-drillteq-v-200-en.pdf
vertical-cnc-processing-centers-drillteq-v-200-en.pdf
AmirStern2
 
Introduction to Internet of things .ppt.
Introduction to Internet of things .ppt.Introduction to Internet of things .ppt.
Introduction to Internet of things .ppt.
hok12341073
 
Precisely Demo Showcase: Powering ServiceNow Discovery with Precisely Ironstr...
Precisely Demo Showcase: Powering ServiceNow Discovery with Precisely Ironstr...Precisely Demo Showcase: Powering ServiceNow Discovery with Precisely Ironstr...
Precisely Demo Showcase: Powering ServiceNow Discovery with Precisely Ironstr...
Precisely
 

Data Analysis with TensorFlow in PostgreSQL

  • 1. Data Analysis with TensorFlow in PostgreSQL Dave Page 12 May 2021
  • 2. Dave Page
    ● EDB (CTO Office)
      ○ VP & Chief Architect, Database Infrastructure
    ● PostgreSQL
      ○ Core Team
      ○ pgAdmin Lead Developer
  • 3. In this talk...
    ● What are PostgreSQL, pl/python3 and TensorFlow?
    ● Why would I use them together?
    ● Examples of analysis types.
    ● Calling TensorFlow from PostgreSQL.
    ● Preparing data.
    ● Designing a network.
    ● Training a model.
    ● Performing analysis.
    2021 Copyright © EnterpriseDB Corporation All Rights Reserved
  • 5. What is PostgreSQL?
    50,000 foot overview
    ● Relational, SQL based database.
    ● Fully enterprise ready; increasingly replacing Oracle, SQL Server, DB2 and more.
    ● Used in pretty much every sector: government, law enforcement, financial, healthcare…
    ● Possibly the most SQL Standard compliant database there is.
    ● Highly extensible:
      ○ Plugin extension modules.
      ○ Plugin procedural languages (e.g. Python, Perl, R, Java, v8).
      ○ Low level code hooks.
  • 6. What is pl/python3?
    50,000 foot overview
    ● Procedural language for PostgreSQL.
    ● Write stored procedures, functions and anonymous blocks within your database.
    ● Supports Python 3:
      ○ Don’t try to use pl/python, which uses the now-obsolete Python 2!
    ● The vast Python ecosystem of libraries may be used.
    ● Combines the power of Python with PostgreSQL.
  • 7. What is TensorFlow?
    50,000 foot overview
    ● Open Source Machine Learning library.
    ● Originated from the Google Brain team.
    ● Extremely powerful and flexible.
    ● Supports a variety of languages:
      ○ Python
      ○ C/C++
      ○ R
      ○ JavaScript
      ○ …
    ● Library of pre-built models and datasets.
    ● Supports distributed learning.
  • 8. Why?
    Not just for fun
    ● Our data is already in the database.
    ● We can easily use the power of SQL to choose and format data for analysis:
      ○ SQL is designed for working with datasets:
        ■ datum ~= scalar
        ■ tuple ~= vector
        ■ array/set ~= matrix/tensor
      ○ SELECT … FROM … WHERE …
      ○ Mathematical functions & operators: sqrt(), log(), power(), mod(), round()...
      ○ Aggregates and Window Functions, Common Table Expressions.
  • 10. Regression analysis
    ● Model relationships between input values (features) and outputs.
    ● Analyse new or hypothetical inputs and predict outputs.
    ● For example, house prices:
      ○ Inputs:
        ■ Number of bedrooms
        ■ Property type (detached, semi, flat etc.)
        ■ Property condition
        ■ Proximity to the beach
        ■ Proximity to major roads or a rail link to the city
        ■ Council tax cost
        ■ Number of nearby pubs serving CAMRA recommended beer
      ○ Output:
        ■ The price of the house
  • 11. Time series analysis
    ● Analyse time series data and make predictions.
    ● More powerful than linear analysis, predicting:
      ○ Linear trends (upwards or downwards)
      ○ Seasonal variability, e.g.
        ■ Summer is busier than winter.
        ■ Friday and Saturday night account for 60% of trade.
        ■ January is always the slowest month.
        ■ Multiple seasonalities can be predicted together.
      ○ Noise is inherently smoothed out, unless it overshadows trends and seasonal variations.
    ● Useful for multiple purposes:
      ○ Capacity management of application deployments.
      ○ Sales predictions.
      ○ Stock management.
  • 12. Other types of analysis
    Not covered in this talk!
    ● Text prediction/generation.
    ● Text classification.
    ● Image classification.
    ● Object detection.
    ● Audio analysis.
    ● Speech recognition.
    ● The list goes on!
  • 14. Setting up pl/python3
    ● Install PostgreSQL:
      ○ If using EDB installers, use StackBuilder to install the LanguagePack.
      ○ On Linux, install the pl/python3 package, e.g. on Debian/Ubuntu: postgresql-plpython3-13.
    ● Run psql or pgAdmin, and execute:
      ○ CREATE EXTENSION plpython3u;
  • 15. Setting up the Python environment
    ● Any Python libraries that will be used need to be added to the Python environment, using pip or the OS package manager:
      ○ On Linux, using the system Python:
        ■ sudo pip3 install <package 1> …
      ○ On macOS, using the EDB LanguagePack:
        ■ sudo /Library/edb/languagepack/v1/Python-3.7/bin/pip install <package 1> …
      ○ On Windows, using the EDB LanguagePack (as Administrator):
        ■ C:\edb\languagepack\v1\Python-3.7\bin\pip install <package 1> …
    ● Recommended starter packages:
      ○ tensorflow
      ○ numpy (will be installed automatically as a dependency of tensorflow)
      ○ pandas
      ○ matplotlib
      ○ seaborn
  • 16. A brief introduction to pl/python3
    A.K.A. Making sure it all works
  • 18. Preparing the data
    ● Cleanup:
      ○ Goal: maximise the accuracy of the model.
      ○ Method: eliminate data that might skew results.
      ○ Requires: analysis and understanding of existing data.
      ○ Applies mostly to regression analysis where we're trying to model a relationship, rather than time series.
    ● Multiple data sets:
      ○ Training data is used to teach the model.
      ○ Validation data is used during training to validate what has been learnt.
      ○ Test data is optionally used to test the model.
      ○ Training vs. validation data is typically randomly selected for regression analysis.
      ○ Training vs. validation data is typically sequential for time series analysis.
      ○ Ratio of training to validation (and test) data is usually skewed towards training, e.g. 3:1 or 4:1.
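The difference between random and sequential selection can be illustrated with a small standard-library sketch. The helper name `split_rows`, its parameters and the 100-row toy input are assumptions made for illustration; they are not taken from the talk.

```python
import random


def split_rows(rows, validation_pct=20, test_pct=0, shuffle=False, seed=0):
    """Split rows into (training, validation, test) lists.

    shuffle=True gives a random selection (typical for regression);
    shuffle=False keeps the original order (typical for time series).
    """
    rows = list(rows)
    if shuffle:
        random.Random(seed).shuffle(rows)
    total = len(rows)
    test_n = total * test_pct // 100
    validation_n = total * validation_pct // 100
    training_n = total - validation_n - test_n
    return (rows[:training_n],
            rows[training_n:training_n + validation_n],
            rows[training_n + validation_n:])


# Time series: sequential split with a 4:1 training to validation ratio.
train, val, test = split_rows(range(100), validation_pct=20)

# Regression: random split with a 3:1 training to validation ratio.
r_train, r_val, r_test = split_rows(range(100), validation_pct=25, shuffle=True)

print(len(train), len(val), len(test))
print(len(r_train), len(r_val), len(r_test))
```

With the sequential split the validation rows are always the most recent ones, so the model is validated on data that follows the training period.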
  • 19. Correlations
    Analysis
    ● Some features have stronger correlations to the output than others.
    ● We can exclude uncorrelated or loosely correlated features to simplify the neural network (model) and increase accuracy.

    NOTICE: Correlation data:
                 crim        zn     indus      chas       nox        rm       age       dis       rad       tax   ptratio         b     lstat      medv
    crim     1.000000 -0.200469  0.406583 -0.055892  0.420972 -0.219247  0.352734 -0.379670  0.625505  0.582764  0.289946 -0.385064  0.455621 -0.388305
    zn      -0.200469  1.000000 -0.533828 -0.042697 -0.516604  0.311991 -0.569537  0.664408 -0.311948 -0.314563 -0.391679  0.175520 -0.412995  0.360445
    indus    0.406583 -0.533828  1.000000  0.062938  0.763651 -0.391676  0.644779 -0.708027  0.595129  0.720760  0.383248 -0.356977  0.603800 -0.483725
    chas    -0.055892 -0.042697  0.062938  1.000000  0.091203  0.091251  0.086518 -0.099176 -0.007368 -0.035587 -0.121515  0.048788 -0.053929  0.175260
    nox      0.420972 -0.516604  0.763651  0.091203  1.000000 -0.302188  0.731470 -0.769230  0.611441  0.668023  0.188933 -0.380051  0.590879 -0.427321
    rm      -0.219247  0.311991 -0.391676  0.091251 -0.302188  1.000000 -0.240265  0.205246 -0.209847 -0.292048 -0.355501  0.128069 -0.613808  0.695360
    age      0.352734 -0.569537  0.644779  0.086518  0.731470 -0.240265  1.000000 -0.747881  0.456022  0.506456  0.261515 -0.273534  0.602339 -0.376955
    dis     -0.379670  0.664408 -0.708027 -0.099176 -0.769230  0.205246 -0.747881  1.000000 -0.494588 -0.534432 -0.232471  0.291512 -0.496996  0.249929
    rad      0.625505 -0.311948  0.595129 -0.007368  0.611441 -0.209847  0.456022 -0.494588  1.000000  0.910228  0.464741 -0.444413  0.488676 -0.381626
    tax      0.582764 -0.314563  0.720760 -0.035587  0.668023 -0.292048  0.506456 -0.534432  0.910228  1.000000  0.460853 -0.441808  0.543993 -0.468536
    ptratio  0.289946 -0.391679  0.383248 -0.121515  0.188933 -0.355501  0.261515 -0.232471  0.464741  0.460853  1.000000 -0.177383  0.374044 -0.507787
    b       -0.385064  0.175520 -0.356977  0.048788 -0.380051  0.128069 -0.273534  0.291512 -0.444413 -0.441808 -0.177383  1.000000 -0.366087  0.333461
    lstat    0.455621 -0.412995  0.603800 -0.053929  0.590879 -0.613808  0.602339 -0.496996  0.488676  0.543993  0.374044 -0.366087  1.000000 -0.737663
    medv    -0.388305  0.360445 -0.483725  0.175260 -0.427321  0.695360 -0.376955  0.249929 -0.381626 -0.468536 -0.507787  0.333461 -0.737663  1.000000
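As a rough illustration of how such a correlation might be computed and then used to pick features, here is a standard-library sketch. The `pearson` helper, the toy input lists and the 0.4 cut-off are assumptions for illustration only; the `medv_correlation` values are copied from the medv column of the table above.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Toy data (hypothetical): a perfectly linear relationship.
r = pearson([1, 2, 3, 4], [2, 4, 6, 8])

# Correlation of each input feature with the output (medv), from the table.
medv_correlation = {
    'crim': -0.388305, 'zn': 0.360445, 'indus': -0.483725, 'chas': 0.175260,
    'nox': -0.427321, 'rm': 0.695360, 'age': -0.376955, 'dis': 0.249929,
    'rad': -0.381626, 'tax': -0.468536, 'ptratio': -0.507787, 'b': 0.333461,
    'lstat': -0.737663,
}

# Keep features whose absolute correlation meets an assumed threshold.
THRESHOLD = 0.4
selected = [name for name, value in medv_correlation.items()
            if abs(value) >= THRESHOLD]
print(selected)
```

The sign of the correlation is ignored when selecting, because a strong negative correlation (such as lstat) is as informative to the model as a strong positive one (such as rm).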
  • 20. Eliminating outliers
    Analysis
    ● Outlier values in the training/validation data can make it harder to build an accurate model.
    ● Analyse the input features and automatically remove rows with outliers using an algorithm such as interquartile range (IQR), i.e. values that lie more than 1.5 × IQR below the first quartile or above the third quartile:

    NOTICE: Outliers detected using IQR:
    row     crim      zn   indus    chas     nox      rm     age     dis     rad     tax ptratio       b   lstat    medv
    0      False   False   False   False   False   False   False   False   False   False   False   False   False   False
    1      False   False   False   False   False   False   False   False   False   False   False   False   False   False
    2      False   False   False   False   False   False   False   False   False   False   False   False   False   False
    3      False   False   False   False   False   False   False   False   False   False   False   False   False   False
    ...
    18     False   False   False   False   False   False   False   False   False   False   False    True   False   False
    19     False   False   False   False   False   False   False   False   False   False   False   False   False   False
    ...
  • 21. Eliminating outliers
    Example code

    # Outlier detection
    # Note: 'data' is a Pandas dataframe containing our raw data
    Q1 = data.quantile(0.25)
    Q3 = data.quantile(0.75)
    IQR = Q3 - Q1
    plpy.notice('Outliers detected using IQR:\n{}\n'.
                format((data < (Q1 - 1.5 * IQR)) | (data > (Q3 + 1.5 * IQR))))

    # Outlier Removal
    plpy.notice('Removing outliers...')
    data = data[~((data < (Q1 - 1.5 * IQR)) | (data > (Q3 + 1.5 * IQR))).any(axis=1)]
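The same rule can be followed by hand on a tiny data set. This is a standard-library sketch rather than the pandas code from the slide; the helper names and the six sample rows are hypothetical. `statistics.quantiles` with `method="inclusive"` is used because it interpolates linearly between data points, which is also the pandas default.

```python
from statistics import quantiles


def iqr_bounds(values, k=1.5):
    """Return (lower, upper) fences: Q1 - k*IQR and Q3 + k*IQR."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def remove_outlier_rows(rows, k=1.5):
    """Drop every row that has at least one value outside its column's fences."""
    columns = list(zip(*rows))
    bounds = [iqr_bounds(column, k) for column in columns]
    return [row for row in rows
            if all(low <= value <= high
                   for value, (low, high) in zip(row, bounds))]


# Hypothetical sample: the last row has an extreme value in the first column.
rows = [(1.0, 10.0), (2.0, 11.0), (3.0, 12.0),
        (4.0, 13.0), (5.0, 14.0), (100.0, 12.0)]

cleaned = remove_outlier_rows(rows)
print(iqr_bounds([row[0] for row in rows]))
print(cleaned)
```

For the first column Q1 is 2.25 and Q3 is 4.75, so the fences are -1.5 and 8.5 and only the row containing 100.0 is dropped.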
  • 22. Visualisation
    Everyone likes a pretty picture
  • 23. Creating data sets
    Example code

    # Figure out how many rows to use for training, validation and test
    test_rows = int((actual_rows/100) * test_pct)
    validation_rows = int((actual_rows/100) * validation_pct)
    training_rows = actual_rows - test_rows - validation_rows

    # Split the data into input and output dataframes (the last column is the output)
    input = data[columns[:-1]]
    output = data[columns[-1:]]

    # Split the input and output into training, validation and test sets
    training_input = input[:training_rows]
    training_output = output[:training_rows]
    validation_input = input[training_rows:training_rows+validation_rows]
    validation_output = output[training_rows:training_rows+validation_rows]
    test_input = input[training_rows+validation_rows:]
    test_output = output[training_rows+validation_rows:]
  • 25. Designing a model
    ● A model is an interconnected layered network of known mathematical functions with trainable parameters (or filters); a.k.a. a neural network.
    ● Different model architectures are suited to different types of task:
      ○ Regression might use a simple network with multiple layers:
        ■ The number of input filters matches the number of input features.
        ■ Inner layers can be constructed as desired for best results; often based on trial and error and experience.
        ■ The number of output filters matches the number of outputs.
        ■ Layers are dense; an activation function allows modelling of non-linear functions.
      ○ The WaveNet architecture is well suited to time series analysis, despite being originally designed for audio analysis:
        ■ A single filter on the input layer.
        ■ Multiple layers of filters with increasing dilation to detect seasonal patterns, e.g. 2, 4, 8, 16, 32.
        ■ A single filter on the output layer.
        ■ Layers are convolutional; all filters in one layer connect to all filters in the next.
  • 26. Creating the model
    Regression analysis

    import tensorflow as tf

    # Define the model
    # 2 layers of 13 filters for the input features, and one layer of one filter for the output
    # Note: input_shape must match the number of input feature columns passed to fit()
    l1 = tf.keras.layers.Dense(units=13, input_shape=(2,), activation='relu')
    l2 = tf.keras.layers.Dense(units=13, activation='relu')
    l3 = tf.keras.layers.Dense(units=1)
    model = tf.keras.Sequential([l1, l2, l3])

    # Compile it
    model.compile(loss=tf.keras.losses.MeanSquaredError(), optimizer='adam')
  • 27. Creating the model
    Time series analysis

    from tensorflow import keras

    # Define the model
    model = keras.models.Sequential()

    # Input layer
    model.add(keras.layers.InputLayer(input_shape=[None, 1]))

    # Add multiple 1D convolutional layers with increasing dilation rates to
    # allow each layer to detect patterns over longer time spans
    for dilation_rate in (1, 2, 4, 8, 16, 32):
        model.add(keras.layers.Conv1D(filters=32,
                                      kernel_size=2,
                                      strides=1,
                                      dilation_rate=dilation_rate,
                                      padding="causal",
                                      activation="relu"))

    # Add one output layer, with 1 filter to give us one output per time step
    model.add(keras.layers.Conv1D(filters=1, kernel_size=1))

    # Create a learning optimiser and compile the model
    # (learning_rate replaces the deprecated lr argument)
    optimizer = keras.optimizers.Adam(learning_rate=3e-4)
    model.compile(loss=keras.losses.Huber(), optimizer=optimizer, metrics=["mae"])
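To make "causal" padding and dilation a little more concrete, here is a minimal pure-Python sketch of what a single filter in one of these layers computes, ignoring the bias and activation. It is an illustration only, not how Keras implements `Conv1D`; the function name `causal_dilated_conv` and the sample values are assumptions for this example.

```python
def causal_dilated_conv(series, weights, dilation_rate):
    """Apply one 1D filter causally to a list of floats (no bias, no activation).

    The output at step t depends only on the inputs at t, t - d, t - 2d, ...
    Positions before the start of the series are treated as zero, which is
    what causal padding amounts to, so the output has the same length.
    """
    kernel_size = len(weights)
    output = []
    for t in range(len(series)):
        total = 0.0
        for k, weight in enumerate(weights):
            # The last weight applies to the current step; earlier weights
            # reach further back in time, spaced dilation_rate steps apart.
            index = t - (kernel_size - 1 - k) * dilation_rate
            if index >= 0:
                total += weight * series[index]
        output.append(total)
    return output


series = [1.0, 2.0, 3.0, 4.0, 5.0]

# Kernel size 2, dilation 2: each output is x[t - 2] + x[t]
dilated = causal_dilated_conv(series, [1.0, 1.0], dilation_rate=2)
```

No output ever looks at a future time step, which is what allows the model to produce one forecast per time step without leaking later values into earlier predictions.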
  • 29. Training the model
    ● Training is repeated over multiple passes through the data (or epochs), hopefully improving each time:
      ○ The training data set is used for learning.
      ○ The validation data set is used to validate results during training.
      ○ The test data is optionally used to test the model after training.
    ● We monitor a metric to assess how well the network is learning:
      ○ For regression, I've had success with Mean Squared Error (which I monitor as Root Mean Squared Error).
      ○ For time series, Huber loss works well (it's less sensitive to outliers than MSE).
    ● A callback is used to checkpoint (save) the model each time an epoch gives a better validation loss than any previous epoch.
    ● With regression analysis, we use an 'early stopping' callback to exit the training epoch loop when no further significant improvement is made, to prevent the network memorising the training data rather than learning the underlying mathematical relationship (over-fitting).
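The difference between the two losses is easiest to see with an outlier. The sketch below compares Mean Squared Error with Huber loss on a handful of made-up prediction errors, assuming a Huber threshold (delta) of 1.0; the error values and function names are illustrative only.

```python
from math import sqrt


def squared_error(error):
    """Squared error for a single prediction."""
    return error ** 2


def huber(error, delta=1.0):
    """Huber loss: quadratic for small errors, linear beyond delta."""
    abs_error = abs(error)
    if abs_error <= delta:
        return 0.5 * error ** 2
    return delta * (abs_error - 0.5 * delta)


def mean(values):
    return sum(values) / len(values)


# Made-up errors: three small misses and one outlier
errors = [0.5, -0.5, 0.5, 10.0]

mse = mean([squared_error(e) for e in errors])
rmse = sqrt(mse)
mean_huber = mean([huber(e) for e in errors])
```

Here the single outlier contributes 100 to the squared-error sum but only 9.5 to the Huber sum, so it dominates the MSE (25.1875) far more than the Huber loss (2.46875).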
  • 30. Training the model
    Regression analysis

    from math import sqrt
    from tensorflow.keras.callbacks import EarlyStopping, LambdaCallback, ModelCheckpoint

    # max_z, epochs and the training/validation sets are assumed to have been
    # defined earlier in the function

    # Save a checkpoint each time our loss metric improves.
    checkpoint = ModelCheckpoint("checkpoint.h5", save_best_only=True)

    # Use early stopping
    early_stopping = EarlyStopping(patience=50)

    # Display output. This would go to stdout automatically if we weren't using pl/python
    logger = LambdaCallback(
        on_epoch_end=lambda epoch, logs: plpy.notice(
            'epoch: {}, training RMSE: {} ({}%), validation RMSE: {} ({}%)'.format(
                epoch,
                sqrt(logs['loss']),
                round(100 / max_z * sqrt(logs['loss']), 5),
                sqrt(logs['val_loss']),
                round(100 / max_z * sqrt(logs['val_loss']), 5))))

    # Train it!
    history = model.fit(training_input, training_output,
                        validation_data=(validation_input, validation_output),
                        epochs=epochs, verbose=False, batch_size=50,
                        callbacks=[logger, checkpoint, early_stopping])
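The interplay between the checkpoint and early stopping callbacks can be sketched without TensorFlow at all. The following is a simplified simulation over a made-up list of per-epoch validation losses, not the Keras implementation; the function name and the loss values are assumptions for this example.

```python
def train_with_early_stopping(validation_losses, patience):
    """Simulate checkpoint-on-best plus early stopping.

    Returns (best_epoch, best_loss, stopped_epoch). stopped_epoch is None
    if training ran through every epoch without being stopped early.
    """
    best_loss = float("inf")
    best_epoch = None
    epochs_without_improvement = 0
    stopped_epoch = None

    for epoch, loss in enumerate(validation_losses):
        if loss < best_loss:
            # This is the point at which a checkpoint would be saved
            best_loss, best_epoch = loss, epoch
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                stopped_epoch = epoch
                break

    return best_epoch, best_loss, stopped_epoch


# Made-up validation losses: improvement stalls after epoch 5
losses = [0.9, 0.5, 0.4, 0.41, 0.45, 0.39, 0.5, 0.6, 0.7]
result = train_with_early_stopping(losses, patience=3)
```

With a patience of 3, training stops at epoch 8, but the saved checkpoint is the one from epoch 5, which is why the best model is loaded back from the checkpoint file rather than taken from the final epoch.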
  • 31. Training the model
    Time series analysis

    from tensorflow import keras

    # Save checkpoints when we get the best model
    model_checkpoint = keras.callbacks.ModelCheckpoint("checkpoint.h5", save_best_only=True)

    # Use early stopping to prevent over fitting
    early_stopping = keras.callbacks.EarlyStopping(patience=50)

    # Display output. This would go to stdout automatically if we weren't using pl/python
    # This model was compiled with Huber loss and an MAE metric, so we report
    # those rather than the RMSE used for regression
    logger = keras.callbacks.LambdaCallback(
        on_epoch_end=lambda epoch, logs: plpy.notice(
            'epoch: {}, training loss: {} (MAE: {}), validation loss: {} (MAE: {})'.format(
                epoch,
                logs['loss'],
                logs['mae'],
                logs['val_loss'],
                logs['val_mae'])))

    # Train it!
    history = model.fit(train_set, epochs=100,
                        validation_data=valid_set,
                        callbacks=[early_stopping, logger, model_checkpoint])
  • 32. Use once vs. use many
    ● Each model is trained with a specific data set.
    ● With regression analysis, we can re-use a model with any input features to predict an output:
      ○ In practice this means we might use the model repeatedly over time to model different inputs.
    ● With time series analysis we can re-use the model to predict different timeframes:
      ○ In practice, this means we might only use a model once when performing time series analysis.
    ● Models can be 're-trained' as new data becomes available:
      ○ If the data distribution has changed, the model might degrade.
      ○ It may be preferable to re-train from scratch.
    ● For complex problems, it may be useful to start with a suitable pre-trained generic model, and continue training with specific data:
      ○ This is known as transfer learning.
  • 33. Using
  • 34. Using the model
    Regression analysis

    CREATE OR REPLACE FUNCTION public.rg_analysis(
        input_values double precision[],
        model_path text)
        RETURNS double precision[]
        LANGUAGE 'plpython3u'
    AS $BODY$
    import tensorflow as tf

    # Reset everything
    tf.keras.backend.clear_session()
    tf.random.set_seed(42)

    # Load the model from the path passed to the function
    model = tf.keras.models.load_model(model_path)

    # Are we dealing with a single prediction,
    # or a list of them?
    if not any(isinstance(sub, list) for sub in input_values):
        data = [input_values]
    else:
        data = input_values

    # Make the prediction(s)
    result = model.predict([data])[0]

    # Flatten the per-row outputs into a single list of plain floats
    result = [float(item) for elem in result for item in elem]

    return result
    $BODY$;
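The two pieces of list handling in that function (wrapping a single set of inputs into a batch, and flattening the per-row outputs) can be tried out on their own, without TensorFlow or PostgreSQL. The sketch below uses made-up values in place of real model output; the function names `as_batch` and `flatten` are assumptions for this example.

```python
def as_batch(input_values):
    """Wrap a single set of features in a list so we always have a batch."""
    if not any(isinstance(sub, list) for sub in input_values):
        return [input_values]
    return input_values


def flatten(predictions):
    """Turn rows of one-element outputs into a flat list of floats."""
    return [float(item) for row in predictions for item in row]


# A single set of two input features, and a list of two sets
single = as_batch([1.5, 2.5])
many = as_batch([[1.5, 2.5], [3.0, 4.0]])

# Made-up outputs, shaped like one predicted value per input row
flat = flatten([[0.1], [0.2]])
```

Either way the model sees a list of rows, and the caller gets back a flat array with one prediction per row, which fits the function's `double precision[]` return type.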
  • 35. Using the model
    Time series analysis

    import numpy as np
    import matplotlib.pyplot as plt
    from tensorflow import keras

    # model_forecast() and plot_series() are helper functions that are not shown
    # on this slide; series, dates, window_size, train_samples and valid_samples
    # come from the earlier data preparation steps

    # Load the best model from the last checkpoint
    model = keras.models.load_model("checkpoint.h5")

    cnn_forecast = model_forecast(model, series[..., np.newaxis], window_size)
    cnn_forecast = cnn_forecast[train_samples - window_size:-1, -1, 0]

    # Pad each series with NaNs so that all three line up on the same date axis
    plt.figure(figsize=(10, 6))
    plot_series(dates,
                np.concatenate([series[:train_samples],
                                np.full(valid_samples, None, dtype=float)]),
                label="Training Data")
    plot_series(dates,
                np.concatenate([np.full(train_samples, None, dtype=float),
                                series[train_samples:]]),
                label="Validation Data")
    plot_series(dates,
                np.concatenate([np.full(train_samples, None, dtype=float),
                                cnn_forecast]),
                label="Forecast Data")
    plt.savefig('ts_analysis.png')
  • 37. Summary
    In this talk, we:
    ● Introduced PostgreSQL, TensorFlow and pl/python3.
    ● Discussed why we might use them together.
    ● Introduced two (of many) types of analysis we can perform:
      ○ Regression.
      ○ Time Series.
    ● Showed how we can call TensorFlow from PostgreSQL using pl/python3.
    ● Walked through the main steps of performing an analysis, considering regression and time series problems:
      ○ Preparing the data.
      ○ Creating a model.
      ○ Training the model.
      ○ Using the model.
  • 38. Questions and resources
    Questions?
    ● EDB blog, includes posts on machine learning and other topics:
      ○ https://www.enterprisedb.com/dave-page
    ● Experimental code from my ML/AI journey:
      ○ https://github.com/dpage/ml-experiments
    ● Other resources:
      ○ https://www.postgresql.org
      ○ https://www.tensorflow.org
      ○ https://www.postgresql.org/docs/current/plpython.html
      ○ https://pandas.pydata.org
      ○ https://numpy.org
      ○ https://matplotlib.org
      ○ https://seaborn.pydata.org