Common solutions and tools developed by Google Cloud's Professional Services team.
The examples folder contains example solutions across a variety of Google Cloud Platform products. Use these solutions as a reference for your own or extend them to fit your particular use case.
- BigQuery Audit Log - Solution to help audit BigQuery usage using Data Studio for visualization and a sample SQL script to query the back-end data source consisting of audit logs.
- BigQuery Cross Project Slot Monitoring - Solution to help monitor slot utilization across multiple projects, while breaking down allocation per project.
- BigQuery Group Sync For Row Level Access - Sample code to synchronize group membership from G Suite/Cloud Identity into BigQuery and join that with your data to control access at row level.
- BigQuery Pipeline Utility - Python utility class for defining data pipelines in BigQuery.
- Bigtable Dataflow Cryptocurrencies Exchange RealTime Example - Apache Beam example that reads from the Crypto Exchanges WebSocket API as a Google Cloud Dataflow pipeline and saves the feed in Google Cloud Bigtable. Real-time visualization and query examples from GCP Bigtable running on a Flask server are included.
- Cloud Composer Examples - Examples of using Cloud Composer, GCP's managed Apache Airflow service.
- Cloud SQL Custom Metric - An example of creating a Stackdriver custom metric monitoring Cloud SQL Private Services IP consumption.
- CloudML Bank Marketing - Notebook for creating a classification model for marketing using CloudML.
- CloudML Bee Health Detection - Detect if a bee is unhealthy based on an image of it and its subspecies.
- CloudML Energy Price Forecasting - Predicting the future energy price based on historical price and weather.
- CloudML Fraud Detection - Fraud detection model for credit card transactions.
- CloudML Sentiment Analysis - Sentiment analysis for movie reviews using TensorFlow `RNNEstimator`.
- CloudML Scikit-learn Pipeline - An example of building a scikit-learn-based machine learning pipeline trainer that can be run on AI Platform. The pipeline can be trained locally or remotely on AI Platform, and the trained model can then be deployed on AI Platform to serve online traffic.
- CloudML TensorFlow Profiling - TensorFlow profiling examples for training models with CloudML.
- Data Generator - Generate random data with a custom schema at scale for integration tests or demos.
- Dataflow BigQuery Transpose Example - An example pipeline to transpose/pivot/rotate a BigQuery table.
- Dataflow Elasticsearch Indexer - An example pipeline that demonstrates the process of reading JSON documents from Cloud Pub/Sub, enhancing the document using metadata stored in Cloud Bigtable and indexing those documents into Elasticsearch.
- Dataflow Python Examples - Various ETL examples using the Dataflow Python SDK.
- Dataflow Scala Example: Kafka2Avro - Example to read objects from Kafka, and persist them encoded in Avro in Google Cloud Storage, using Dataflow with SCIO.
- Dataflow Streaming Benchmark - Utility to publish randomized fake JSON messages to a Cloud Pub/Sub topic at a configured QPS.
- Dataproc Persistent History Server for Ephemeral Clusters - Example of writing logs from an ephemeral cluster to GCS and using a separate single node cluster to look at Spark and YARN History UIs.
- Dataflow Template Pipelines - Pre-implemented Dataflow template pipelines for solving common data tasks on Google Cloud Platform.
- DLP API Examples - Examples of the DLP API usage.
- GCE Access to Google AdminSDK - Example to help manage access to Google's AdminSDK using GCE's service account identity.
- Home Appliance Status Monitoring from Smart Power Readings - An end-to-end demo system featuring a suite of Google Cloud Platform products such as IoT Core, ML Engine, BigQuery, etc.
- IoT Nirvana - An end-to-end Internet of Things architecture running on Google Cloud Platform.
- Kubeflow Pipelines Sentiment Analysis - Create a Kubeflow Pipelines component and pipelines to analyze sentiment for New York Times front page headlines using Cloud Dataflow (Apache Beam Java) and Cloud Natural Language API.
- Kubeflow Fairing Example - Three notebooks that demonstrate how to use Kubeflow Fairing to train machine learning jobs (scikit-learn, XGBoost, TensorFlow) locally or in the cloud (AI Platform training or a Kubeflow cluster).
- Pub/Sub Client Batching Example - Batching in Pub/Sub's Java client API.
- QAOA - Examples of parsing a max-SAT problem in a proprietary format.
- Redis Cluster on GKE Example - Deploying Redis cluster on GKE.
- Spinnaker - Example pipelines for a Canary / Production deployment process.
- Uploading files directly to Google Cloud Storage by using Signed URL - Example architecture to enable uploading files directly to GCS by using Signed URL.
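To make the transpose/pivot idea behind the Dataflow BigQuery Transpose Example concrete, here is a minimal in-memory sketch in plain Python. It is not the example pipeline's code and does not use Dataflow or BigQuery; the function name `pivot_rows` and the field names `id`, `metric`, and `value` are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def pivot_rows(rows, key_field, pivot_field, value_field):
    """Turn long-format rows into one wide row per key.

    Each distinct value of `pivot_field` becomes a column holding the
    matching `value_field`. All names here are illustrative assumptions.
    """
    wide = defaultdict(dict)
    for row in rows:
        wide[row[key_field]][row[pivot_field]] = row[value_field]
    # Emit one dict per key, keeping the key itself as a column.
    return [{key_field: key, **cols} for key, cols in wide.items()]


# Hypothetical long-format input: one row per (id, metric) pair.
rows = [
    {"id": 1, "metric": "clicks", "value": 10},
    {"id": 1, "metric": "views", "value": 200},
    {"id": 2, "metric": "clicks", "value": 3},
]
pivoted = pivot_rows(rows, "id", "metric", "value")
```

In this sketch a key that lacks a given metric simply has no such column; a real pipeline would likely need a fixed output schema with nulls for missing values.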
The tools folder contains ready-made utilities which can simplify Google Cloud Platform usage.
- AssetInventory - Import Cloud Asset Inventory resources into BigQuery.
- BigQuery Discount Per-Project Attribution - A tool that automates the generation of a BigQuery table that uses existing exported billing data, by attributing both CUD and SUD charges on a per-project basis.
- BigQuery Query Plan Exporter - Command line utility for exporting BigQuery query plans in a given date range.
- BigQuery Visualizer - A web application which provides the ability to visualise the execution stages of BigQuery query plans to aid in the optimization of queries.
- CloudConnect - A package that automates the setup of dual VPN tunnels between AWS and GCP.
- Cloudera Parcel GCS Connector - This script helps you create a Cloudera parcel that includes the Google Cloud Storage connector. The parcel can be deployed on a Cloudera managed cluster.
- Dataflow Throttling - A library that can be used to limit the number of requests from Dataflow to an external service. It buffers requests so that the external service called by the Dataflow pipeline is not overloaded, and it also activates when the service starts rejecting requests due to out-of-quota errors.
- DNS Sync - Sync a Cloud DNS zone with GCE resources. Instances and load balancers are added to the Cloud DNS zone as they start, based on compute_engine_activity log events sent from a Pub/Sub push subscription. Can sync multiple projects to a single Cloud DNS zone.
- GCE Quota Sync - A tool that fetches resource quota usage from the GCE API and synchronizes it to Stackdriver as a custom metric, where it can be used to define automated alerts.
- GCP Architecture Visualizer - A tool that takes CSV output from a Forseti Inventory scan and draws out a dynamic hierarchical tree diagram of org -> folders -> projects -> gcp_resources using the D3.js JavaScript library.
- GCS Bucket Mover - A tool to move a user's bucket, including objects, metadata, and ACL, from one project to another.
- GKE Billing Export - Google Kubernetes Engine fine grained billing export.
- GSuite Exporter - A Python package that automates syncing Admin SDK APIs activity reports to a GCP destination. The module takes entries from the chosen Admin SDK API, converts them into the appropriate format for the destination, and exports them to that destination (e.g., Stackdriver Logging).
- Hive to BigQuery - A Python framework to migrate Hive tables to BigQuery using Cloud SQL to keep track of the migration progress.
- LabelMaker - A tool that reads key:value pairs from a JSON file and labels the running instance and all attached drives accordingly.
- Maven Archetype Dataflow - A Maven archetype which bootstraps a Dataflow project with common plugins pre-configured to help maintain high code quality.
- Netblock Monitor - An Apps Script project that will automatically provide email notifications when changes are made to Google’s IP ranges.
- Site Verification Group Sync - A tool to provision "verified owner" permissions (to create GCS buckets with custom DNS) based on membership of a Google Group.
- Agile Machine Learning API - A web application which provides the ability to train and deploy ML models on Google Cloud Machine Learning Engine, and visualize the predicted results using LIME through a simple POST request.
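As a rough illustration of the kind of behavior the Dataflow Throttling entry describes (limiting request rate and reacting to quota rejections), the following Python sketch paces calls and slows down multiplicatively when a call is rejected. It is a simplification and not the library's implementation or API: it does not buffer requests, and the names `Throttle`, `QuotaExceeded`, `max_per_second`, and `max_attempts` are hypothetical.

```python
import time


class QuotaExceeded(Exception):
    """Hypothetical error raised when the external service rejects a call."""


class Throttle:
    """Paces calls to a service and backs off when it reports quota errors."""

    def __init__(self, max_per_second, sleep=time.sleep):
        self.base_interval = 1.0 / max_per_second  # configured pacing
        self.interval = self.base_interval         # current pacing
        self._sleep = sleep                        # injectable for testing

    def call(self, fn, *args, max_attempts=5):
        for _ in range(max_attempts):
            self._sleep(self.interval)  # pace every request
            try:
                result = fn(*args)
            except QuotaExceeded:
                self.interval *= 2  # service is rejecting: slow down
                continue
            # Success: drift back toward the configured rate.
            self.interval = max(self.base_interval, self.interval / 2)
            return result
        raise QuotaExceeded("gave up after %d attempts" % max_attempts)
```

A caller would wrap each outbound request, for example `Throttle(10).call(send, payload)`, where `send` is whatever function contacts the external service and raises `QuotaExceeded` on a quota rejection.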
See the contributing instructions to get started contributing.
All solutions within this repository are provided under the Apache 2.0 license. Please see the LICENSE file for more detailed terms and conditions.
This repository and its contents are not an official Google Product.
Questions, issues, and comments should be directed to [email protected].