Google Cloud Platform handbook for enthusiasts

One of the most difficult decisions for a cloud engineer is choosing the most appropriate cloud provider for a project. Which one offers the best value for money in a specific region, or worldwide? One of the main providers to consider before making a decision is GCP. It was launched on April 7, 2008, and runs on the same infrastructure that Google uses internally for its end-user products, such as Gmail and YouTube. I hope this article helps you understand some basic concepts of GCP and gives you a first taste of the platform.

Computing Services

Compute Engine

It is a service that allows users to create VMs, attach persistent storage to those VMs, and make use of other GCP services, such as Cloud Storage.

When you create an instance, you can specify some parameters like:

  • The operating system
  • Size of persistent storage
  • Adding graphics processing units (GPUs) for compute-intensive operations like machine learning
  • Making the VM preemptible (Google may shut it down at any time, and it costs up to 80% less. A preemptible VM is always terminated after it has run for 24 hours. If it is preempted within 10 minutes of starting, you are not charged for that time. Preemptible VMs cannot live-migrate to a regular VM, cannot be set to automatically restart, and are not covered by an SLA)

When you create a VM you may consider several factors:

  • Cost, which varies between regions
  • Data locality regulations
  • High availability
  • Latency
  • Need for specific hardware platforms

To create a Compute Engine instance, users must be team members on the project and have appropriate permissions. Users can be associated with projects as individual users, Google Groups, G Suite domains, or service accounts. You can then use predefined roles to assign permissions for specific tasks, such as Compute Engine Admin, Compute Engine Network Admin, Compute Engine Security Admin, and Compute Engine Viewer.

Compute Engine has more than 25 predefined VM types, and it is a good option when you need maximum control over your VM instances.

To create an instance using the Cloud SDK (e.g. Cloud Shell) the command is:

$ gcloud compute instances create instance-1 --zone europe-west1-b 

To list the virtual machines you created, use:

$ gcloud compute instances list 

The --filter="zone:ZONE" parameter displays only the VMs in the specified zone, and --limit limits the number of VMs displayed. The --sort-by parameter sorts the results by a resource field.
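
As an illustration, these flags can be combined on a single call; the zone and limit values below are placeholders:

```shell
# List at most 5 VMs in us-central1-a, sorted by name.
$ gcloud compute instances list \
    --filter="zone:us-central1-a" \
    --sort-by=name \
    --limit=5
```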

To view your project information use:

$ gcloud compute project-info describe 

You can get a list of zones with the following command:

$ gcloud compute zones list 

The following list of some important commands will help you work with instances.

  • --account specifies a GCP account to use, overriding the default account.
  • --configuration uses a named configuration file that contains key-value pairs.
  • --flatten generates separate key-value records when a key has multiple values.
  • --format specifies an output format, such as the default (human-readable), CSV, JSON, YAML, text, or other possible options.
  • --help displays a detailed help message.
  • --project specifies a GCP project to use, overriding the default project.
  • --quiet disables interactive prompts and uses defaults.
  • --verbosity specifies the level of detail of output messages. Options are: debug, info, warning, and error.
  • --zone specifies the zone to operate on, overriding the default zone set when you ran gcloud init.
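
As an illustration, several of these global flags can be combined on a single call; the project ID below is a placeholder:

```shell
# List instances in a specific project, formatted as JSON, with no prompts.
$ gcloud compute instances list \
    --project my-project-id \
    --format=json \
    --quiet
```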

To start, stop, or delete an instance the syntax is:

gcloud compute instances start _INSTANCE_NAMES_ 

When an instance is deleted, the disks attached to the VM may be deleted or kept by using the --delete-disks and --keep-disks parameters.

The --keep-disks=all parameter will keep all disks, and --delete-disks=data will delete all non-boot disks.

The --async parameter returns immediately and runs the operation in the background; use --verbosity for more detailed output.
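
A hedged example of the lifecycle commands (the instance name is a placeholder):

```shell
# Stop a running instance (it can be started again later).
$ gcloud compute instances stop instance-1

# Delete the instance but keep all of its disks.
$ gcloud compute instances delete instance-1 --keep-disks=all
```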

Snapshots and Images

Snapshots are copies of the data on a persistent disk, which allow you to restore that data later. When you create a snapshot, make sure that data has been flushed from memory to the disk. To work with snapshots you need the Compute Storage Admin IAM role.

Snapshots are used to make data available on a disk, while images are used to create VMs.

To create a snapshot of a disk use:

$ gcloud compute disks snapshot _DISK_NAME_ --snapshot-names=_NAME_ 

To list the snapshots use:

$ gcloud compute snapshots list 

and for details on a snapshot use:

$ gcloud compute snapshots describe _SNAPSHOT_NAME_ 

To create a disk use:

$ gcloud compute disks create _DISK_NAME_ --source-snapshot=_SNAPSHOT_NAME_ 

The parameters --size and --type can also be used for the size and type of the disk.

To create a custom image for a VM use:

$ gcloud compute images create _IMAGE_NAME_ 

You can use one of the following parameters for the source:

  • --source-disk
  • --source-image
  • --source-image-family
  • --source-snapshot
  • --source-uri

To export an image to Cloud Storage use:

$ gcloud compute images export --destination-uri _DESTINATION_URI_ --image _IMAGE_NAME_ 

Instance Groups

An instance group is a set of VMs that are managed as a single entity; gcloud commands applied to the group apply to all of its members. There are two types:

  • Managed - contain identical VMs. They are created using a template that describes the machine type, boot disk image, zone, labels, and other properties of an instance. A managed group can scale automatically, and if an instance crashes it can be recreated automatically. A load balancer can also be used to distribute workloads across the group.
  • Unmanaged - should be used when you need to work with different configurations.

To create an instance template use:

$ gcloud compute instance-templates create _TEMPLATE_NAME_ 

With the --source-instance parameter you can use an existing VM as the source.

Instance groups can contain instances in a single zone (zonal) or across a region (regional).
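
Putting the pieces together, a managed instance group can be sketched like this; the names and size below are illustrative:

```shell
# Create a template from an existing VM.
$ gcloud compute instance-templates create my-template \
    --source-instance instance-1

# Create a managed instance group of 3 identical VMs from the template.
$ gcloud compute instance-groups managed create my-group \
    --template my-template --size 3
```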

Kubernetes Engine

It is designed to allow users to run containerized applications on a cluster of servers. Users describe the compute, storage and memory resources they need, and Kubernetes Engine provisions the underlying resources. In addition, it monitors the health of servers in the cluster and automatically repairs problems such as failed servers, and it supports autoscaling. With Kubernetes Engine you are able to:

  • Create Clusters of VMs that run the Kubernetes orchestration software for containers
  • Deploy containerized applications to the cluster
  • Administer the cluster
  • Specify policies, such as autoscaling
  • Monitor cluster health

Kubernetes Engine provides the following functions:

  • Load balancing across Compute Engine VMs that are deployed in a Kubernetes cluster
  • Automatic scaling of nodes (VMs) in the cluster
  • Automatic upgrading of cluster software as needed
  • Node monitoring and health repair
  • Logging
  • Support for node pools, which are collections of nodes all with the same configuration

A Kubernetes cluster includes a cluster master node, which contains the Kubernetes API server, the resource controllers and the schedulers. It also contains one or more worker nodes; these are Compute Engine VMs, which contain the pods. Containers within a single pod share storage and network resources: they share an IP address and port space.
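
As a minimal sketch, a cluster can be created and inspected with the following commands (the cluster name and node count are placeholders):

```shell
# Create a three-node Kubernetes cluster.
$ gcloud container clusters create my-cluster --num-nodes 3

# Fetch credentials so kubectl can talk to the cluster.
$ gcloud container clusters get-credentials my-cluster

# List the worker nodes (Compute Engine VMs).
$ kubectl get nodes
```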

Kubernetes maintains cluster health by shutting down pods that demand excessive resources and by running multiple identical pods (replicas). When a group of identical pods (a deployment) is rolled out, it can be in one of three states:

  • Progressing - in the process of performing a task
  • Completed - all pods are running the latest version of containers
  • Failed - it cannot recover from the problem encountered

Basic Concepts

  • Pods - single instances of a running process in a cluster; they contain at least one container. Multiple containers are used when they need to share resources. Pods use shared networking and storage across their containers. Each pod gets a unique IP address and set of ports; containers connect to a port, so multiple containers in a pod connect to different ports and can talk to each other on localhost. Pods are considered ephemeral: they are expected to be terminated if they become unhealthy, stuck, or crash. A controller manages scaling and health monitoring.
  • Services - a Service is an object that provides an API endpoint with a stable IP address, allowing applications to discover the pods running a particular application. Services update when changes are made to pods, so they maintain an up-to-date list of the pods running an application.
  • ReplicaSet - a controller, used by a deployment, that ensures the correct number of identical pods are running. If a pod is terminated by a controller because it was unhealthy, the ReplicaSet will create another.
  • Deployment - a set of identical pods, all running the same application because they are created from the same pod template.
  • StatefulSet - allows a single pod to respond to all calls from a client during a single session. It assigns a unique identifier to each pod so Kubernetes can keep track of which pod is used by which client. It is used when an application needs a unique network identity or persistent storage.
  • Job - a workload that creates pods and runs them until the application completes. Job specifications are defined in a configuration file and include the container to use and the command to run.
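
The concepts above map to kubectl commands; a hedged sketch, assuming kubectl is configured against a cluster (the image and names are illustrative):

```shell
# Create a Deployment; Kubernetes creates a ReplicaSet and the pods.
$ kubectl create deployment my-app --image=nginx

# Scale to 3 replicas; the ReplicaSet keeps 3 identical pods running.
$ kubectl scale deployment my-app --replicas=3

# Expose the Deployment behind a Service with a stable endpoint.
$ kubectl expose deployment my-app --port=80 --type=LoadBalancer
```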

App Engine

It is a compute PaaS on which users don't need to configure VMs or specify Kubernetes clusters. Instead, you develop in a popular programming language and deploy that code to a serverless environment. Each project can have only one App Engine application. It is available in two types:

  • Standard: You run applications in a language-specific sandbox which is isolated from the OS and from other applications running on the same server.
  • Flexible: You run Docker containers in the App Engine environment, so you can also use custom libraries or third-party software. It also gives you the option to work with background processes and write to local disk.

App Engine applications consist of an application, services, versions and instances. An application has at least one service, which is the code executed in the App Engine environment. Under each service there is a versioning system that holds all the versions of your application, and when a version executes it creates an instance of the app. Services are typically structured to perform a single function, and complex applications are made up of several microservices.

Instances can be configured to be dynamic or resident. Dynamic instances are optimized to decrease cost, resident instances for performance. If you configure manual scaling, the instances will be resident. To specify automatic scaling, add a section to app.yaml that includes the term automatic_scaling followed by any of the following key-value pairs of configuration options:

  • target_cpu_utilization - Specifies the CPU utilization threshold at which new instances are started, as a number between 0.5 and 0.95
  • target_throughput_utilization - Specifies the threshold of concurrent requests at which new instances are started. This is specified as a number between 0.5 and 0.95
  • max_concurrent_requests - Specifies the max concurrent requests an instance can accept. The default is 10 and the max is 80
  • max_instances and min_instances - Indicates the range of number of instances that can run
  • max_pending_latency and min_pending_latency - Indicates the time a request will wait in the queue to be processed
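
A hedged app.yaml sketch using some of these options (the values are illustrative, not recommendations):

```yaml
automatic_scaling:
  target_cpu_utilization: 0.65
  max_concurrent_requests: 50
  min_instances: 1
  max_instances: 10
```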

With App Engine you can split traffic by IP address (where a client is always routed to the same split), HTTP cookie (which allows assigning users to versions) or randomly (for even distribution of the workload). When you use a cookie, the HTTP request header for a cookie named GOOGAPPUID contains a hash value between 0 and 999. If there is no GOOGAPPUID cookie, the traffic is routed randomly.

To split traffic between v1 and v2 at 50% use:

$ gcloud app services set-traffic _SERVICE_NAME_ --splits v1=.5,v2=.5 

Cloud Functions

It is a lightweight computing option for event-driven processing. It supports Node.js, Python, Go and Java, and the functions execute in a secure, isolated environment. They support autoscaling, and they are independent of each other and stateless. An event is an action in GCP, such as a file upload to Cloud Storage. GCP supports events in Cloud Storage (upload, delete or archive), Cloud Pub/Sub (publishing a message), HTTP (calling a specific endpoint), Firebase (triggers in the Firebase DB) and Logging (by forwarding logs to a Pub/Sub topic). For every Cloud Function you define a trigger, and triggers are associated with functions.
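
For example, deploying a function triggered by uploads to a bucket might be sketched as follows; the function name, runtime and bucket are placeholders:

```shell
# Deploy a function that runs whenever an object is finalized in a bucket.
$ gcloud functions deploy my-function \
    --runtime python39 \
    --trigger-resource my-bucket \
    --trigger-event google.storage.object.finalize
```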

Storage Services

Cloud Storage

It is an object storage system. You can store any type of file or binary large object in a bucket. It is not a file system but a service that receives, stores and retrieves files or objects from a distributed storage system. Each stored object is uniquely addressable by a URL, and you can apply permissions to read and write objects in a bucket. It also offers cold storage, a low-cost archival tier designed for high durability and infrequent access. One limitation of Cloud Storage is that it doesn't provide functionality to manipulate subcomponents of an object: you cannot modify part of a file in place, for example. It also doesn't support concurrency and locking; the last data written to an object is what is stored and persisted. Cloud Storage provides four different classes of object storage:

  • Multi-regional - stores your data in multiple regions. The main benefit is redundancy in case of zone-level failures. It is also known as geo-redundant.
  • Regional - stores your data in a single region; it is cheaper than multi-regional.
  • Nearline - for data that is expected to be accessed less than once per month.
  • Coldline - for data that is expected to be accessed less than once per year.

Buckets in Cloud Storage can be configured to keep versions of objects.
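
Object versioning is toggled per bucket with gsutil; a sketch (the bucket name is a placeholder):

```shell
# Enable object versioning on a bucket, then verify the setting.
$ gsutil versioning set on gs://my-bucket
$ gsutil versioning get gs://my-bucket
```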

Persistent Disk

It is a storage service whose disks are attached to VMs in Compute Engine or Kubernetes Engine. It provides block storage on SSD or HDD, so you can create a filesystem on it. Persistent disks support multiple readers without an impact on performance, and they can be resized without restarting the VM. They have a 64 TB limit. Persistent disks can be regional or zonal; regional is more expensive but replicates your disk to two different zones within a region.

Cloud Storage for Firebase

It combines cloud object storage with support for unreliable network connections, such as those of mobile devices. It provides secure transmission and robust recovery mechanisms to handle varying network quality.

Cloud Filestore

It provides a shared file system for use with Compute and Kubernetes Engine. It has high IOPS as well as variable storage capacity that can be configured. It implements the NFS protocol.

Databases

Cloud SQL

It is a service which allows the user to set up SQL Server, MySQL or PostgreSQL on VMs without the need to administer it, like backing up or patching DB software. It includes management of replication and automatic failover. Cloud SQL supports transactions.

Cloud Spanner

It is a globally distributed relational database which combines the benefits of RDBs like transactions and strong consistency with the ability to scale horizontally. It also has enterprise-grade security with encryption at rest and in transit, along with identity-based access controls. It supports ANSI 2011 standard SQL. Cloud Spanner supports transactions.

Cloud Bigtable

It is designed for petabyte-scale applications with billions of rows and thousands of columns. It is based on a NoSQL model known as a wide-column data model. It is suited for applications that require low-latency read and write operations.

Cloud Datastore

It is a NoSQL document DB which can be accessed via a REST API that can be used from applications running in Compute, App or Kubernetes Engine. It will also shard or partition data to maintain performance. It also takes care of replication, backups and other administrative tasks. It supports transactions, indexes and SQL-like queries.

Cloud Firestore

It is a managed NoSQL DB designed for highly scalable mobile and web applications. It provides offline support, synchronization and other features for mobile and web apps. It also works with Firebase.

Cloud Memorystore

It is an in-memory cache service: a managed Redis service for caching frequently used data in memory, which can be used to reduce the time an application needs to read data. When you use Memorystore you create instances that run Redis with 1 GB to 300 GB of memory.

Networking

Virtual Private Cloud (VPC)

A VPC in GCP can span the globe without relying on the public Internet. Traffic from any server on a VPC can be securely routed through Google's network to any other network, and your backend servers can access GCP services without a public IP. A VPC can be linked to an on-premises network using IPSec. It is a global resource, not tied to a specific region or zone, and it contains subnets (short for subnetworks), which are regional resources.

Firewall

Firewall rules are defined at the network level and control the flow of network traffic to VMs. Rules can apply to incoming (ingress) or outgoing (egress) traffic and to specific ports. The firewall is stateful for the duration of an active connection (at least one packet exchanged every ten minutes). Firewall rules consist of several components:

  • Direction - ingress or egress
  • Priority - highest-priority rules are applied and it is specified by an integer from 0 (highest) to 65535 (lowest)
  • Action - can be either allow or deny
  • Target - is an instance to which the rule applies and it can be all instances in a network
  • Source/destination - source applies to ingress rules and specifies source IP ranges. The IP address 0.0.0.0/0 indicates any IP address. The destination parameter uses only IP ranges.
  • Protocol and port - such as TCP, UDP, or ICMP and a port number. If no protocol is specified, then the rule applies to all protocols.
  • Enforcement status - either enabled or disabled. Disabled rules are not applied even if they match.

To create firewall rules using the command line use:

$ gcloud compute firewall-rules create _RULE_NAME_ 

The parameters you can add are: --action, --allow, --description, --destination-ranges, --direction, --network, --priority, --source-ranges, --source-service-accounts, --source-tags, --target-service-accounts and --target-tags.
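
Combining these components, a rule allowing inbound HTTP from anywhere might be sketched as follows (the rule name is a placeholder):

```shell
# Allow ingress TCP traffic on port 80 from any source IP.
$ gcloud compute firewall-rules create allow-http \
    --direction INGRESS \
    --action ALLOW \
    --rules tcp:80 \
    --source-ranges 0.0.0.0/0
```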

CIDR Blocks

CIDR blocks define the range of IP addresses that are available in a subnet. To expand the range (for example, when you expand the size of clusters in a subnet) use:

$ gcloud compute networks subnets expand-ip-range _NAME_OF_SUBNET_ --prefix-length 12 

Cloud Load Balancing

It can distribute workloads within and across regions using a single IP address. It is a software service that can load-balance HTTP, HTTPS, TCP/SSL and UDP traffic. Load balancers are characterized by three main features:

  • Global versus regional - global for globally distributed applications, regional for load balancing within a region
  • External versus internal load balancing
  • Traffic type, such as HTTP or TCP

Cloud Armor

It is a network security service that builds on the Global Load Balancing service. It can allow or restrict access based on IP address, protect against XSS and SQL injection attacks, and allow or restrict access based on the geolocation of incoming traffic.

Cloud CDN

Content Delivery Networks enable a low-latency response to content requests by caching content on a set of endpoints across the globe.

Cloud Interconnect

It is a set of services for connecting your existing network to Google's network. It offers two types of connection, interconnect and peering.

Cloud DNS

It is a domain name service with high availability and low latency.

Management Tools

Stackdriver

This is a service that collects metrics, logs, and event data from applications and infrastructure and integrates the data so DevOps engineers can monitor, assess, and diagnose operational problems.

Monitoring

This extends the capabilities of Stackdriver by collecting performance data from GCP, AWS resources, and application instrumentation, including popular open-source systems like NGINX, Cassandra, and Elasticsearch.

Logging

This service enables users to store, analyze, and alert on log data from both GCP and AWS logs.

Error Reporting

This aggregates application crash information for display in a centralized interface.

Trace

This is a distributed tracing service that captures latency data about an application to help identify performance problem areas.

Debugger

This enables developers to inspect the state of executing code, inject commands, and view call stack variables.

Profiler

This is used to collect CPU and memory utilization information across the call hierarchy of an application. The Profiler uses statistical sampling to minimize the impact of profiling on application performance.

Specialized Services

Apigee

It is a management service for companies offering API access to their applications. You can deploy, monitor and secure your APIs, and it generates API proxies based on the OpenAPI Specification. It provides configurable routing and rate limiting. Data is encrypted at rest and in transit, and APIs can be authenticated using either OAuth 2.0 or SAML.

Data Analytics

BigQuery

It is a petabyte-scale analytics database service for data warehousing

Cloud Dataflow

A framework for defining batch and stream processing pipelines

Cloud Dataproc

It is a managed Hadoop and Spark service. It is designed for data manipulation, statistical analysis, machine learning and other complex operations.

Cloud Dataprep

A service that allows analysts to explore and prepare data for analysis

Artificial Intelligence and Machine Learning

Cloud AutoML

This is a tool that allows developers without machine learning experience to develop machine learning models.

Cloud Machine Learning Engine

This is a platform for building and deploying scalable machine learning systems to production.

Cloud Natural Language Processing

This tool is for analyzing human languages and extracting information from text.

Cloud Vision

This is an image analysis platform for annotating images with metadata, extracting text, or filtering content.

Resource Hierarchy

It is a way to group resources and manage them as a single unit. It consists of three levels:

  • Organization - the root of the hierarchy. A single Cloud Identity account is associated with one organization and has super admins. Those super admins assign the Organization Administrator Identity and Access Management (IAM) role to the users who manage the organization. Project Creator and Billing Account Creator roles are also granted automatically. The Organization Administrator role is responsible for defining the structure of the resource hierarchy, defining IAM policies over the resource hierarchy, and delegating other management roles.
  • Folder - organizations contain folders; folders contain other folders or projects.
  • Project - in projects we create resources, use services, manage permissions and billing options. Anyone with the resourcemanager.projects.create IAM permission can create a project. By default, every user in an organization has this permission. There is a quota of projects an organization can create and it is defined by Google. If you reach this quota you need to apply for more projects.
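
A hedged sketch of creating a project within the hierarchy (the project and folder IDs are placeholders):

```shell
# Create a new project under a folder in the resource hierarchy.
$ gcloud projects create my-project-id --folder 123456789

# List the projects you can access.
$ gcloud projects list
```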

Organization Policies

Organization Policy Service controls access to an organization's resources. It lets you assign permissions to users or roles and specify limits on the ways resources can be used. IAM specifies who can do things and the Organization Policy Service specifies what can be done.

Identity Management

It enables customers to define fine-grained access controls on resources. It uses the concepts of users, roles and privileges. Identities are abstractions about users of services. Roles are sets of permissions that can be assigned to an identity.

Roles and Identities

A role is a collection of permissions. An identity is a record in GCP to represent a user. There are three types of roles:

  • Primitive - those are Owner, Editor and Viewer.
  • Predefined - they provide granular access to resources and they are specific to GCP products
  • Custom - allow administrators to create their own roles

Permissions can be assigned only to roles and roles can be assigned to identities.

Service Accounts

A service account is an identity used by an application or VM to act on behalf of a user. Service accounts can be treated both as users and as resources. There are two types: user-managed and Google-managed. Users can create up to 100 service accounts per project. When you create a Compute Engine instance or an App Engine application, a Compute Engine or App Engine service account is created automatically and granted the Editor role.

Predefined roles are grouped by service; App Engine, for example, has five roles:

  • App Engine Admin - which grants read, write and modify permission to application and configuration settings
  • App Engine Service Admin - which grants read-only access to configuration settings and write access to module-level and version-level settings.
  • App Engine Deployer - which grants read-only access to application configuration and settings, and write access to create new versions. Users with only the App Engine Deployer role cannot modify or delete existing versions
  • App Engine Viewer - which grants read-only access to application configuration and settings
  • App Engine Code Viewer - which grants read-only access to all application configurations, settings, and deployed source code

Billing Accounts

They store information about how to pay the charges for resources used. A billing account is associated with one or more projects. There are two types of billing accounts:

  • Self-serve - paid by credit card or direct debit from a bank account; the costs are charged automatically
  • Invoiced - bills or invoices are sent to customers

The billing roles are:

  • Billing Account Creator - which can create new self-service billing accounts
  • Billing Account Administrator - which manages billing accounts but cannot create them
  • Billing Account User - which enables a user to link projects to billing accounts
  • Billing Account Viewer - which enables a user to view billing account cost and transactions

The billing section also allows you to create budgets and alerts. By default, three alerts are created when you create a budget (at 50%, 90% and 100%). You can export your billing data either to BigQuery or to a Cloud Storage file (CSV or JSON).

I hope you will find this article useful. Thanks for reading it, and share if you liked it!




