Dataproc supports regional endpoints based on Compute Engine regions. You must specify a region, such as "us-east1" or "europe-west1", when you create a Dataproc cluster. Dataproc isolates cluster resources, such as VM instances, Cloud Storage, and metadata storage, within a zone in the specified region.
You can optionally specify a zone within the specified cluster region, such as "us-east1-a" or "europe-west1-b", when you create a cluster. If you do not specify a zone, Dataproc Auto Zone Placement chooses a zone within your specified cluster region in which to locate cluster resources.
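The relationship between a zone name and its region can be checked before a create request is sent. The following is a minimal sketch, assuming the Compute Engine convention that a zone name is its region name followed by a hyphen and a short suffix (as in "us-east1-a"). The helper names `region_of_zone` and `zone_in_region` are hypothetical and not part of any Dataproc API.

```python
def region_of_zone(zone: str) -> str:
    """Return the region part of a zone name, e.g. 'us-east1-a' -> 'us-east1'.

    Assumes the zone is written as REGION-SUFFIX (hypothetical helper).
    """
    region, separator, suffix = zone.rpartition("-")
    if not separator or not region or not suffix:
        raise ValueError(f"not a zone name of the form REGION-SUFFIX: {zone!r}")
    return region


def zone_in_region(zone: str, region: str) -> bool:
    """Check that a zone belongs to the region chosen for the cluster."""
    return region_of_zone(zone) == region


if __name__ == "__main__":
    print(zone_in_region("us-east1-a", "us-east1"))
    print(zone_in_region("europe-west1-b", "us-east1"))
```

A check like this only validates the naming pattern. It does not confirm that the zone actually exists or is available to your project.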
The regional namespace corresponds to the /regions/REGION segment of Dataproc resource URIs (see, for example, the cluster networkUri).
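To illustrate the /regions/REGION segment, the sketch below extracts the region from a resource URI with a regular expression. The function name `region_from_uri` and the sample URIs are illustrative assumptions. Only the /regions/REGION segment itself comes from the text above.

```python
import re
from typing import Optional

# Matches the "/regions/REGION" segment and captures REGION.
_REGION_SEGMENT = re.compile(r"/regions/([a-z0-9-]+)(?:/|$)")


def region_from_uri(uri: str) -> Optional[str]:
    """Return the REGION from a URI containing /regions/REGION, else None."""
    match = _REGION_SEGMENT.search(uri)
    return match.group(1) if match else None


if __name__ == "__main__":
    # Sample URI for illustration only; project and cluster names are made up.
    sample = "projects/my-project/regions/us-central1/clusters/my-cluster"
    print(region_from_uri(sample))
```

URIs that have no regional segment, such as a global resource path, yield `None` in this sketch.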
Regional endpoint semantics
Regional endpoint names follow a standard naming convention based on Compute Engine regions. For example, the name of the Central US region is us-central1, and the name of the Western Europe region is europe-west1. Run the gcloud compute regions list command to list the available regions.
Create a cluster
gcloud
When you create a cluster, specify a region using the required --region flag.
gcloud dataproc clusters create CLUSTER_NAME \
    --region=REGION \
    other args ...
REST API
Use the REGION URL parameter in a clusters.create request to specify the cluster region.
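As a rough sketch of where the REGION parameter sits in a clusters.create request, the helper below builds the request URL. It assumes the v1 REST path `projects/PROJECT_ID/regions/REGION/clusters` under `https://dataproc.googleapis.com/v1`. The function name `clusters_create_url` is hypothetical, and sending the request (authentication and the Cluster request body) is deliberately left out.

```python
from urllib.parse import quote

# Assumed base URL of the Dataproc v1 REST API.
API_ROOT = "https://dataproc.googleapis.com/v1"


def clusters_create_url(project_id: str, region: str) -> str:
    """Build the URL that a clusters.create POST request would target."""
    if not project_id or not region:
        raise ValueError("project_id and region are both required")
    # Percent-encode each path parameter so it cannot alter the URL structure.
    project = quote(project_id, safe="")
    location = quote(region, safe="")
    return f"{API_ROOT}/projects/{project}/regions/{location}/clusters"


if __name__ == "__main__":
    print(clusters_create_url("my-project", "us-central1"))
```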
gRPC
Set the client transport address to the regional endpoint using the following pattern:
REGION-dataproc.googleapis.com
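The pattern can be applied with a small helper so that the endpoint and the region passed in requests stay in sync. This is a minimal sketch. The function name `regional_endpoint` is hypothetical, and the default port 443 is taken from the address used in the client examples on this page.

```python
def regional_endpoint(region: str, port: int = 443) -> str:
    """Return the transport address REGION-dataproc.googleapis.com:PORT."""
    if not region:
        raise ValueError("region must be a non-empty string, e.g. 'us-central1'")
    return f"{region}-dataproc.googleapis.com:{port}"


if __name__ == "__main__":
    region = "us-central1"
    # Use the same region value for the endpoint and for the request itself.
    print(regional_endpoint(region))
```

Deriving the address from a single `region` variable avoids the mismatch where a client connects to one regional endpoint while its requests name a different region.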
Python (google-cloud-python) example:
from google.cloud import dataproc_v1
from google.cloud.dataproc_v1.gapic.transports import cluster_controller_grpc_transport

transport = cluster_controller_grpc_transport.ClusterControllerGrpcTransport(
    address='us-central1-dataproc.googleapis.com:443')
client = dataproc_v1.ClusterControllerClient(transport)

project_id = 'my-project'
region = 'us-central1'
cluster = {...}

Java (google-cloud-java) example:
ClusterControllerSettings settings =
    ClusterControllerSettings.newBuilder()
        .setEndpoint("us-central1-dataproc.googleapis.com:443")
        .build();
try (ClusterControllerClient clusterControllerClient = ClusterControllerClient.create(settings)) {
  String projectId = "my-project";
  String region = "us-central1";
  Cluster cluster = Cluster.newBuilder().build();
  Cluster response =
      clusterControllerClient.createClusterAsync(projectId, region, cluster).get();
}

Console
Specify a Dataproc region in the Location section of the Set up cluster panel on the Dataproc Create a cluster page in the Google Cloud console.
What's next
- Geography and Regions
- Compute Engine→Regions and Zones
- Compute Engine→Global, Regional, and Zonal Resources
- Dataproc Auto Zone Placement