Understanding Helm Templates and Utilizing YQ for YAML Parsing Mastery

This week I've been finishing up a task that involves updating our CI/CD pipeline to use a feature in the newest release of the Open Telemetry Collector that validates configuration files. In doing so, I got a chance to get up close and personal with Helm templating and see how the sausage (a generated K8s manifest) is made. Along the way, I also learned a cool trick with yq that let me extract a specific set of data from a YAML file.

Come along and see how I deciphered the flow of data and variables in the Helm chart, and how I used yq to extract exactly the data I needed from a pile of YAML.


Background

My team runs a couple of Open Telemetry Collectors (OTCs) to ingest a bunch of telemetry.

  1. These OTCs read their config from a configuration file before starting up and doing their job of collecting whatever they're meant to collect.

  2. The configuration files are built (per environment) and passed in at deploy time by the CI/CD pipeline as the last step.

    1. The configuration files are built by Helm and are passed in as part of a Helm chart.

Open Telemetry Collector Configuration files look like this:
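As a rough idea of the shape (a minimal sketch, not one of our real configs), the snippet below writes a tiny collector config to disk. The file name example-config.yaml and the choice of an otlp receiver and a logging exporter are assumptions made purely for illustration.

```shell
# Write a tiny, illustrative collector config to disk.
# Assumption: example-config.yaml, the otlp receiver and the logging
# exporter are placeholders, not what we actually run.
cat > example-config.yaml <<'EOF'
receivers:
  otlp:
    protocols:
      grpc:

exporters:
  logging:

service:
  pipelines:
    traces:
      receivers: [otlp]
      exporters: [logging]
EOF
```

Every component named under service.pipelines has to be declared in its own top-level section, which is exactly the kind of mistake that is easy to make and annoying to find at deploy time.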

However, a smol hitch with this setup when used with Helm templating is that (1) the configuration files passed in at deploy time are specific to the environment (e.g. staging might have a different config than public) and (2) we don't catch a bad config until it's too late (at deploy time).

Goal

To update our CI/CD pipeline so that we can validate our configuration files to catch errors early, preferably on a PR whenever the configuration files (or underlying template files) are changed.

Support

One thing that helped us on this quest was the (at the time) pending v0.80 release of the Open Telemetry Collector, which would bring support for the validate command for checking configuration files.

As of v0.80, you can provide a config file to the validate command like so:

This exits with a zero exit code (i.e. successfully) if the config file is valid.
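For reference, here's a minimal sketch of that invocation wrapped in a shell function. The binary name otelcol is an assumption (distributions ship under different names, so override OTELCOL_BIN to match yours), and validate_config is just a helper name I made up.

```shell
# Assumption: the collector binary is on PATH as "otelcol".
# Override OTELCOL_BIN if yours is named differently.
OTELCOL_BIN="${OTELCOL_BIN:-otelcol}"

# validate_config is a hypothetical helper: it runs the validate
# subcommand against one file and passes the exit status through.
validate_config() {
  "$OTELCOL_BIN" validate --config="$1"
}
```

Because the exit status is passed straight through, something like validate_config my-config.yaml && echo "config is valid" is all a CI step needs.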

Mission

Update the CI/CD pipeline to use the validate command and catch bad configuration files early. This could be done in two places:

  1. PR Builder on Code Changes

  2. CD Pipeline on Infrastructure Deploys.

Problem:

  1. While it's easy to use the validate command to check a single file, there turn out to be a lot of configuration files that we need to check. Think the number of environments * the number of zones per environment, and we're up to about 12 configuration files.

  2. The configuration files are dynamically built using Helm as part of a Helm chart. This means (1) we need to validate using the same config that's provided at runtime and (2) we need to generate those files for the validate command as they don't exist on disk.

A Quick Refresher

An Open Telemetry Collector config file simply looks like this:

We just need to get the config file(s) so we can use them with the validate command of the Open Telemetry Collector binary to determine if the config is good or not:

Which exits with a zero exit code if the config file is valid ☑️

Okay, so let's get these Config Files

Well, since they're generated via Helm at runtime, we will need to use the helm template command to generate them.

This is where I was at the edge of my comprehension - because I didn't understand how all the values were passed in and through the template. So I went on a side quest to learn how it all fits together.

It's all about the flow

With regard to Helm charts and templating, the data flows like this:

  1. Environment-specific values from the environment YAML files (e.g. staging.yaml) are fed into the Helm chart's templates (i.e. templates/*.yaml)

    1. templates/*.yaml also makes use of the zonalbaseconfig.yaml file.
  2. Helm then generates a set of YAML (a manifest) that can be fed to Kubernetes to apply.

The environment files may look something like this:

Which then feeds into the template files. Specifically, templates/collector.yaml which defines the manifest for each collector.

And since our staging.yaml doesn't specify a config: block, the Helm template will fall back to what's in the $baseConfig variable. Recall from earlier that $baseConfig reads from zonalbaseconfig.yaml, which looks like this:
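To make the flow concrete, here's a stripped-down sketch of the pieces involved. Everything in it is hypothetical: the demo-chart directory, the collectors key, the zone names and the file contents are stand-ins for illustration, not our actual chart. The only things carried over are the file names, the $baseConfig variable and the optional config: override.

```shell
# Scaffold a hypothetical chart that mirrors the flow described above.
mkdir -p demo-chart/templates

cat > demo-chart/Chart.yaml <<'EOF'
apiVersion: v2
name: demo-collectors
version: 0.1.0
EOF

# Environment values: one entry per zonal collector, no config: override.
cat > demo-chart/staging.yaml <<'EOF'
collectors:
  zone-a:
    replicas: 1
  zone-b:
    replicas: 1
EOF

# Base collector config shared by every zone unless overridden.
cat > demo-chart/zonalbaseconfig.yaml <<'EOF'
receivers:
  otlp:
    protocols:
      grpc:
exporters:
  logging:
service:
  pipelines:
    traces:
      receivers: [otlp]
      exporters: [logging]
EOF

# Template: use the collector's own config if set, else fall back to
# $baseConfig. Assumption: an override, if present, is a string.
cat > demo-chart/templates/collector.yaml <<'EOF'
{{- $baseConfig := .Files.Get "zonalbaseconfig.yaml" }}
{{- range $name, $collector := .Values.collectors }}
---
apiVersion: opentelemetry.io/v1alpha1
kind: OpenTelemetryCollector
metadata:
  name: {{ $name }}
spec:
  replicas: {{ $collector.replicas }}
  config: |
{{ default $baseConfig $collector.config | indent 4 }}
{{- end }}
EOF
```

The design point is the default call: because neither zone sets config, both collectors should end up with the contents of zonalbaseconfig.yaml under spec.config.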

If you've been following the bouncing ball, you'll notice that the flow looks like this:

Now that we understand the flow, we know that the template that generates these files is templates/collector.yaml.

And so, we can see what Helm generates by running:

helm template -f staging.yaml -s templates/collector.yaml . > templated.yaml

Which gives us something like this:

You'll notice that these are the entire manifests for all the zonal collectors we've specified in the environment file (i.e. staging.yaml) which means in addition to the config block - it also includes:

  • The Service Accounts that our template probably generates further down in collector.yaml; and

  • The rest of the manifest which we don't need (like the metadata etc).

So now we're in a better state - we have the data we need plus extra, but how do we refine it to get just the config blocks?

How do I get just the config files?

This is where yq comes in handy.

yq is a powerful command-line tool for processing YAML files, similar to how jq works with JSON. It lets you query, filter, and manipulate YAML data: you can extract specific values, transform the structure, and even merge multiple YAML files.

Since the helm template command outputs a single YAML stream containing multiple documents (you can see them separated by ---), I essentially need to do the following:

  • From the whole output, grab the documents that have kind: OpenTelemetryCollector

    • For each of those documents, grab only the spec.config block. (These are what get read in as config files at runtime.)
  • After grabbing all the config blocks, write each one to disk under a predictable file name so that we can then pass them to the validate command.

After much trial and error, behold - the magic 1-liner command that allowed me to do all this:

helm template -f staging.yaml -s templates/collector.yaml . | yq 'select(.kind == "OpenTelemetryCollector").spec.config' -s '"staging-config-" + $index'
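If you want to reuse that one-liner across environments, a minimal sketch is to wrap it in a function. extract_configs is a helper name I made up, and it assumes each environment has a values file named <env>.yaml in the chart directory, like staging.yaml.

```shell
# extract_configs is a hypothetical helper. Run it from the chart directory.
# Assumption: the environment's values file is "<env>.yaml".
extract_configs() {
  env_name="$1"   # e.g. "staging", matching staging.yaml
  helm template -f "${env_name}.yaml" -s templates/collector.yaml . \
    | yq 'select(.kind == "OpenTelemetryCollector").spec.config' \
        -s "\"${env_name}-config-\" + \$index"
}
```

The escaped quotes and \$index matter: yq needs to receive the split expression as a quoted string plus its own $index variable, so the shell must not expand either. Calling extract_configs staging from the chart directory runs the same pipeline as the one-liner.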

Running it leaves me with staging-config-0.yml and staging-config-1.yml in my directory, which look like this:

Next Steps

Now, since I have all my config files available as staging-config-0.yml, staging-config-1.yml, and so on, I can easily feed them into the validate command of my Open Telemetry Collector.
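A minimal sketch of that loop, under a couple of assumptions: the collector binary is on PATH as otelcol (override OTELCOL_BIN otherwise), and validate_all is a hypothetical helper name, not something from our pipeline.

```shell
# Assumption: the collector binary is on PATH as "otelcol".
OTELCOL_BIN="${OTELCOL_BIN:-otelcol}"

# validate_all is a hypothetical helper: it validates every file it is
# given and returns non-zero if at least one of them failed.
validate_all() {
  failed=0
  for cfg in "$@"; do
    if "$OTELCOL_BIN" validate --config="$cfg"; then
      echo "OK   $cfg"
    else
      echo "FAIL $cfg"
      failed=$((failed + 1))
    fi
  done
  [ "$failed" -eq 0 ]
}
```

Calling validate_all staging-config-*.yml checks every file before failing, so a PR build reports all the broken configs in one go rather than stopping at the first.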

The next steps will be updating our CI/CD steps to do so, which will probably be the easier portion of this task.

Conclusion

In conclusion, using Helm templates and YQ can greatly improve the process of generating and validating Open Telemetry Collector configuration files in a CI/CD pipeline. By understanding the data flow in my Helm templates, and by leveraging YQ's powerful querying capabilities, I was able to extract the necessary config files and use the validate command to catch errors early, ensuring a smoother deployment process.