
Wednesday, May 8, 2024

Cloud Functions - How to Read PDF Files on GCS Events and Store in BigQuery

In this tutorial, you will learn how to create an event-driven Cloud Function that reads PDF files from Google Cloud Storage (GCS) and pushes their contents into BigQuery on Google Cloud Platform (GCP).
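At a high level, such a function downloads the PDF from GCS, extracts its text, and streams a row into BigQuery. Here is a minimal sketch, assuming the BigQuery table already exists; the table ID, field names, and the `pdf_to_bigquery` entry point are hypothetical, and text extraction uses the `pypdf` package:

```python
import io

import functions_framework
from google.cloud import bigquery, storage
from pypdf import PdfReader

TABLE_ID = "my-project.my_dataset.pdf_contents"  # hypothetical target table

@functions_framework.cloud_event
def pdf_to_bigquery(cloud_event):
    """Triggered on a GCS object finalize event; extracts PDF text and streams it to BigQuery."""
    data = cloud_event.data
    bucket_name, file_name = data["bucket"], data["name"]
    if not file_name.lower().endswith(".pdf"):
        return  # ignore non-PDF uploads

    # Download the PDF into memory and extract text page by page
    blob = storage.Client().bucket(bucket_name).blob(file_name)
    reader = PdfReader(io.BytesIO(blob.download_as_bytes()))
    text = "\n".join(page.extract_text() or "" for page in reader.pages)

    # Stream one row per file into the target table
    errors = bigquery.Client().insert_rows_json(
        TABLE_ID, [{"file_name": file_name, "content": text}]
    )
    if errors:
        raise RuntimeError(f"BigQuery insert failed: {errors}")
```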

Monday, February 12, 2024

Cloud Function - Load Data into BigQuery Tables on GCS Events

In this article, you will learn how to fire Cloud Functions on GCS object events to pull a CSV file into a BigQuery table in Google Cloud Platform. Cloud Functions is a serverless computing service offered by Google Cloud Platform (GCP) and an easy way to run your code in the cloud.


It supports Java, Python, Ruby, Node.js, Go, and .NET. Currently, Google Cloud Functions supports events from the following providers: HTTP, Cloud Storage, Cloud Firestore, Pub/Sub, Firebase, and Stackdriver. Gen1 is the more lightweight option: one concurrent request per instance, simpler features with fewer knobs to tweak, and lower cost; it is largely deploy-and-forget and runs on the App Engine standard infrastructure. Gen2 is built on Cloud Run and gives you more control: up to 1,000 concurrent requests per instance, larger resources, longer timeouts, and so on. If you don't need those capabilities, just use Gen1. To complete the tasks outlined above, you must have a GCP account and appropriate access.

To accomplish this task, you can use Google Cloud Functions to trigger on Google Cloud Storage (GCS) object events and then pull the CSV file into a BigQuery table. Here's a general outline (a code sketch follows the list):

  • Set up Google Cloud Functions: Create a Cloud Function that triggers on GCS object events. You can specify the event types (e.g., google.storage.object.finalize) to trigger the function when a new file is uploaded to a specific bucket.
  • Configure permissions: Ensure that your Cloud Function has the necessary permissions to access both GCS and BigQuery. You'll likely need to grant the Cloud Function service account permissions to read from GCS and write to BigQuery.
  • Write the Cloud Function code: Write the Cloud Function code to handle the GCS object event trigger. When a new CSV file is uploaded to GCS, the function should read the file, parse its content, and then insert the data into a BigQuery table.
  • Create a BigQuery table: Before inserting data into BigQuery, make sure you have a table created with the appropriate schema to match the CSV file structure.
  • Insert data into BigQuery: Use the BigQuery client library within your Cloud Function code to insert the data parsed from the CSV file into the BigQuery table.
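Here is a minimal sketch of such a function; the target table and the `gcs_to_bigquery` entry point are hypothetical, and the table schema is assumed to be auto-detected from the CSV header:

```python
import functions_framework
from google.cloud import bigquery

TABLE_ID = "my-project.my_dataset.csv_data"  # hypothetical target table

@functions_framework.cloud_event
def gcs_to_bigquery(cloud_event):
    """Triggered by google.storage.object.finalize; loads the uploaded CSV into BigQuery."""
    data = cloud_event.data
    if not data["name"].lower().endswith(".csv"):
        return  # only handle CSV uploads
    uri = f"gs://{data['bucket']}/{data['name']}"

    client = bigquery.Client()
    job_config = bigquery.LoadJobConfig(
        source_format=bigquery.SourceFormat.CSV,
        skip_leading_rows=1,   # skip the header row
        autodetect=True,       # infer the schema from the file
        write_disposition=bigquery.WriteDisposition.WRITE_APPEND,
    )
    # Let BigQuery pull the file straight from GCS instead of parsing it in the function
    load_job = client.load_table_from_uri(uri, TABLE_ID, job_config=job_config)
    load_job.result()  # wait for completion; raises on failure
    print(f"Loaded {uri} into {TABLE_ID}")
```

Using a BigQuery load job with the `gs://` URI avoids parsing the CSV inside the function and lets BigQuery handle type inference and batching.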
For the actual demo, please visit our YouTube channel.


To learn more, please follow us -
🔊 http://www.sql-datatools.com

To learn more, please visit our YouTube channel at -
🔊 http://www.youtube.com/c/Sql-datatools

To learn more, please visit our Instagram account at -
🔊 https://www.instagram.com/asp.mukesh/

To learn more, please visit our Twitter account at -
🔊 https://twitter.com/macxima

Thursday, February 8, 2024

Google Cloud Platform - How to Create Gen2 Cloud Function

In this article, you will learn how to create a Python-based Gen2 Cloud Function in Google Cloud Platform.

Cloud Functions is a serverless computing service offered by Google Cloud Platform (GCP) and an easy way to run your code in the cloud.



It supports Java, Python, Ruby, Node.js, Go, and .NET.

Currently, Google Cloud Functions supports events from the following providers: HTTP, Cloud Storage, Cloud Firestore, Pub/Sub, Firebase, and Stackdriver.

Gen1 is the more lightweight option: one concurrent request per instance, simpler features with fewer knobs to tweak, and lower cost; it is largely deploy-and-forget and runs on the App Engine standard infrastructure.

Gen2 is built on Cloud Run and gives you more control: up to 1,000 concurrent requests per instance, larger resources, longer timeouts, and so on. If you don't need those capabilities, just use Gen1.

To complete the tasks outlined above, you must have a GCP account and appropriate access. 
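To make this concrete, here is a minimal sketch of a Gen2, event-driven Python function; the `hello_gcs` name, bucket, region, and runtime below are placeholders, and a typical deploy command is shown in a comment:

```python
import functions_framework

@functions_framework.cloud_event
def hello_gcs(cloud_event):
    """A minimal Gen2 event-driven function: logs details of the uploaded object."""
    data = cloud_event.data
    print(f"Received {data['name']} in bucket {data['bucket']}")

# A typical Gen2 deploy command (bucket, region, and runtime are placeholders):
#   gcloud functions deploy hello_gcs --gen2 --runtime=python312 \
#       --region=us-central1 --source=. --entry-point=hello_gcs \
#       --trigger-bucket=YOUR_BUCKET
```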




Data validation is the process of checking data against predefined rules and standards, such as data types, formats, ranges, and constraints. Common practices include the following (a short sketch follows the list):

  1. 💫Schema Validation: Verify data adherence to predefined schemas, checking types, formats, and structures.
  2. 💫Integrity Constraints: Enforce rules and constraints to maintain data integrity, preventing inconsistencies.
  3. 💫Cross-Field Validation: Validate relationships and dependencies between different fields to ensure logical coherence.
  4. 💫Data Quality Metrics: Define and track quality metrics, such as completeness, accuracy, and consistency.
  5. 💫Automated Validation Scripts: Develop and run automated scripts to check data against predefined rules and criteria.
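As a small illustration of schema, integrity, and cross-field checks before loading rows into a warehouse, here is a hedged sketch; the field names and rules are hypothetical:

```python
from datetime import date

# Hypothetical schema: expected Python type for each incoming field
SCHEMA = {"order_id": int, "amount": float, "order_date": date, "ship_date": date}

def validate_row(row: dict) -> list[str]:
    """Return a list of validation errors for one record (empty list means valid)."""
    errors = []
    # 1. Schema validation: every expected field is present with the right type
    for field, expected in SCHEMA.items():
        if field not in row:
            errors.append(f"missing field: {field}")
        elif not isinstance(row[field], expected):
            errors.append(f"{field}: expected {expected.__name__}")
    # 2. Integrity constraint: amounts must be non-negative
    if isinstance(row.get("amount"), float) and row["amount"] < 0:
        errors.append("amount must be >= 0")
    # 3. Cross-field validation: an order cannot ship before it was placed
    order_date, ship_date = row.get("order_date"), row.get("ship_date")
    if isinstance(order_date, date) and isinstance(ship_date, date) and ship_date < order_date:
        errors.append("ship_date precedes order_date")
    return errors

# Example usage
bad = {"order_id": 1, "amount": -5.0,
       "order_date": date(2024, 2, 8), "ship_date": date(2024, 2, 1)}
print(validate_row(bad))  # ['amount must be >= 0', 'ship_date precedes order_date']
```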

 

To learn more, please follow us -
🔊 http://www.sql-datatools.com

To learn more, please visit our YouTube channel at -
🔊 http://www.youtube.com/c/Sql-datatools

To learn more, please visit our Instagram account at -
🔊 https://www.instagram.com/asp.mukesh/

To learn more, please visit our Twitter account at -
🔊 https://twitter.com/macxima