Fix #34 - Allow Dataflow job to deploy worker instances into Subnet of shared VPC #43
base: main
Conversation
- Add subnet_complete_url Terraform variable
- Update README to include new variable
Thanks for your pull request! It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA). View this failed invocation of the CLA check for more information. For the most up to date status, view the checks section at the bottom of the pull request.
Hi @Lakhtenkov-iv / @rarsan! Please could you advise me on how to get this pull request reviewed? Cheers
pipeline.tf
Outdated
@@ -100,7 +100,7 @@ resource "google_dataflow_job" "dataflow_job" {
   )
   region     = var.region
   network    = var.network
-  subnetwork = "regions/${var.region}/subnetworks/${local.subnet_name}"
+  subnetwork = coalesce(var.subnet_complete_url, "regions/${var.region}/subnetworks/${local.subnet_name}")
I have some doubts about the logic.
Before the change there are two variables and two scenarios:
- subnet_name
- create_network

create_network defines whether the subnetwork should be created or not; subnet_name is treated as the name of an existing subnet if create_network is false, otherwise it is the name of the subnet to be created.
After the change we have three variables:
- subnet_name
- subnet_complete_url
- create_network

Some of them are mutually exclusive, like create_network = true and subnet_complete_url = "https://...". There could be some unexpected behavior.
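For illustration, one way to make that conflict explicit (assuming Terraform ≥ 1.2, which supports custom resource preconditions; this is not part of the PR) could be a guard like the following fragment:

```hcl
resource "google_dataflow_job" "dataflow_job" {
  # ... existing arguments ...

  lifecycle {
    precondition {
      # Creating a network and pointing at an existing shared-VPC subnet
      # at the same time is ambiguous, so fail the plan early.
      condition     = !(var.create_network && var.subnet_complete_url != "")
      error_message = "create_network = true and subnet_complete_url are mutually exclusive; set at most one of them."
    }
  }
}
```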
@Lakhtenkov-iv Thank you for your prompt reply! I agree with your comment above. I have updated the logic to account for both the case where create_network = true (in which case there is no functional change) and the case where create_network = false, in which case we simply reference the existing subnet using the project, region & subnet variables passed into the root module.
However, in the case where we wish to deploy Dataflow workers into a different subnet, I think at least one other variable is required to capture the project/region/subnet name of this new subnet. That way we can use a call to coalesce() within the second branch of the ternary statement to populate it with the new subnet's region/subnet; a rough sketch follows below.
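A rough sketch of what I mean (worker_subnet_complete_url is a placeholder name for that extra variable, not something defined in this PR yet):

```hcl
# If we create the network ourselves, keep the original short-form path.
# Otherwise prefer an explicitly supplied full subnet URL, falling back to
# the short-form path built from the existing project/region/subnet variables.
subnetwork = var.create_network ? "regions/${var.region}/subnetworks/${local.subnet_name}" : coalesce(
  var.worker_subnet_complete_url,
  "regions/${var.region}/subnetworks/${local.subnet_name}"
)
```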
Let me know what you think.
Cheers
Hi @GreggSchofield.
Yes, you are right. It does not work for networks from third-party projects (projects with a shared VPC).
It should work like this:
subnetwork = startswith(local.subnet_name, "https://") ? local.subnet_name : "regions/${var.region}/subnetworks/${local.subnet_name}"
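For context, the two shapes of value this expression would accept might look like this (illustrative values, not taken from the repository):

```hcl
# Short name: expanded to the abbreviated path, preserving current behaviour.
subnet_name = "dataflow-subnet"

# Full self link of a subnet in a shared VPC host project: used as-is.
subnet_name = "https://www.googleapis.com/compute/v1/projects/host-project/regions/europe-west1/subnetworks/dataflow-subnet"
```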
@GreggSchofield I'm looking to use this feature as well. Is there any way we can push this forward?
As documented in #34, when deploying Dataflow workers into a Subnet of a shared VPC, we need to use the complete URL when referencing this Subnet. At present, we have the following Terraform line to pass a subnetwork into the google_dataflow_job resource:
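```hcl
# pipeline.tf (before this change)
subnetwork = "regions/${var.region}/subnetworks/${local.subnet_name}"
```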
The current logic only allows subnetworks to be referenced using the abbreviated path syntax, which assumes we are deploying the Dataflow workers into the same subnet specified in
sample.tfvars
(where the Pub/Sub & Cloud Monitoring resources will reside). This prevents us from referencing subnets by their complete URL, which is a requirement for deploying Dataflow workers into a subnet of a shared VPC.

The proposed solution is to introduce a new variable named subnet_complete_url, which stores the complete URL of a subnetwork if you choose to utilise an existing shared VPC resource. I have opted to use the Terraform built-in coalesce function to preserve backwards compatibility and prevent breaking changes; a sketch of how the pieces fit together is below.

Don't hesitate to contact me with any questions about this pull request or suggestions for improvements.
Cheers!
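A minimal sketch of how the new variable and the coalesce() call fit together (the variable's description and empty-string default are my assumptions about the final wording, not taken verbatim from this PR):

```hcl
# variables.tf
variable "subnet_complete_url" {
  description = "Complete URL of an existing subnetwork, e.g. in a shared VPC host project. Leave empty to use the abbreviated regions/<region>/subnetworks/<name> path."
  type        = string
  default     = ""
}

# pipeline.tf
resource "google_dataflow_job" "dataflow_job" {
  # ... other arguments unchanged ...
  region     = var.region
  network    = var.network
  # coalesce() returns its first non-empty argument, so an explicit full URL
  # takes precedence and the original behaviour is preserved otherwise.
  subnetwork = coalesce(
    var.subnet_complete_url,
    "regions/${var.region}/subnetworks/${local.subnet_name}"
  )
}
```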