Regional endpoints

Dataproc supports regional endpoints based on Compute Engine regions. You must specify a region, such as "us-east1" or "europe-west1", when you create a Dataproc cluster. Dataproc isolates cluster resources, such as VM instances, Cloud Storage, and metadata storage, within a zone in the specified region.

You can optionally specify a zone within the cluster region, such as "us-east1-a" or "europe-west1-b", when you create a cluster. If you do not specify a zone, Dataproc Auto Zone Placement chooses a zone within the specified cluster region in which to locate the cluster's resources.

The regional namespace corresponds to the /regions/REGION segment of Dataproc resource URIs (see, for example, the cluster networkUri).
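As an illustration of the /regions/REGION segment, the following minimal Python sketch pulls the region out of a resource path. The helper name region_from_uri and the example path are hypothetical and chosen only for this illustration; they are not part of any Dataproc client library.

```python
import re
from typing import Optional

# Matches the "/regions/REGION" segment and captures REGION.
_REGION_SEGMENT = re.compile(r"/regions/([^/]+)")


def region_from_uri(uri: str) -> Optional[str]:
    """Return REGION from a /regions/REGION segment, or None if absent."""
    match = _REGION_SEGMENT.search(uri)
    return match.group(1) if match else None


# Hypothetical resource path, used only for illustration.
example = "/v1/projects/my-project/regions/us-central1/clusters/my-cluster"
print(region_from_uri(example))
```

A path without a /regions/ segment returns None, so callers can tell "no region present" apart from an empty string.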

Regional endpoint semantics

Regional endpoint names follow a standard naming convention based on Compute Engine regions. For example, the name for the Central US region is us-central1, and the name for the Western Europe region is europe-west1. Run the gcloud compute regions list command to list the available regions.

Create a cluster

gcloud

When you create a cluster, specify a region using the required --region flag.

gcloud dataproc clusters create CLUSTER_NAME \
    --region=REGION \
    other args ...

REST API

Use the REGION URL parameter in a clusters.create request to specify the cluster region.
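The sketch below shows where REGION sits in the request URL. It assumes the v1 REST path layout (projects/PROJECT_ID/regions/REGION/clusters under https://dataproc.googleapis.com/v1). The helper name clusters_create_url is hypothetical, and the sketch only builds the URL string; it does not authenticate or send a request.

```python
from urllib.parse import quote

# Assumed base URL for the v1 REST surface.
API_ROOT = "https://dataproc.googleapis.com/v1"


def clusters_create_url(project_id: str, region: str) -> str:
    """Build the URL that a clusters.create POST would target."""
    # Percent-encode each path component so stray characters cannot
    # change the structure of the path.
    project = quote(project_id, safe="")
    reg = quote(region, safe="")
    return f"{API_ROOT}/projects/{project}/regions/{reg}/clusters"


print(clusters_create_url("my-project", "us-central1"))
```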

gRPC

Set the client transport address to the regional endpoint using the following pattern:

REGION-dataproc.googleapis.com
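The pattern can be expressed as a small helper. This is a minimal sketch: the function name regional_endpoint is hypothetical, and the default port of 443 is taken from the client examples on this page.

```python
def regional_endpoint(region: str, port: int = 443) -> str:
    """Return the regional Dataproc address for the given region."""
    if not region:
        raise ValueError("region must be a non-empty string")
    # Pattern from this page: REGION-dataproc.googleapis.com
    return f"{region}-dataproc.googleapis.com:{port}"


print(regional_endpoint("us-central1"))
```

Deriving the address from the same region value that is passed in the request is one way to keep the endpoint and the request region from drifting apart.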

Python (google-cloud-python) example:

from google.cloud import dataproc_v1
from google.cloud.dataproc_v1.gapic.transports import cluster_controller_grpc_transport

transport = cluster_controller_grpc_transport.ClusterControllerGrpcTransport(
    address='us-central1-dataproc.googleapis.com:443')
client = dataproc_v1.ClusterControllerClient(transport)

project_id = 'my-project'
region = 'us-central1'
cluster = {...}

Java (google-cloud-java) example:

ClusterControllerSettings settings =
    ClusterControllerSettings.newBuilder()
        .setEndpoint("us-central1-dataproc.googleapis.com:443")
        .build();
try (ClusterControllerClient clusterControllerClient =
    ClusterControllerClient.create(settings)) {
  String projectId = "my-project";
  String region = "us-central1";
  Cluster cluster = Cluster.newBuilder().build();
  Cluster response =
      clusterControllerClient.createClusterAsync(projectId, region, cluster).get();
}

Console

Specify a Dataproc region in the Location section of the Set up cluster panel on the Dataproc Create a cluster page in the Google Cloud console.

What's next