April 25, 2020

Datadog agent - Kubernetes logs processing


Datadog has great monitoring abilities, including log monitoring.

When your monitored resource is a Kubernetes cluster, there are a couple of ways to get the logs, that is, to deliver them to Datadog servers. The first-class solution is to use Datadog agents for most of the roles.


Why? See:

  • The agent is well compiled
  • Compared to other log delivery agents like fluentd or logstash, Datadog agents use far fewer resources for their own activity.
  • Logs, metrics, traces, and everything else the Datadog agent can transmit arrive already correctly tagged and pre-formatted, depending on the agent config of course.
  • And if you are paying for Datadog, support cases involving their own agent should get a better response than cases about additional integrations where 3rd-party services are involved in log delivery (like other agents or AWS Lambda).

And why is it good to use the agent's pre-formatting features? Well...

  • Log messages are delivered in full (multi_line)
  • Excluding log patterns can reduce your Datadog bill, and your log dashboard will be much cleaner.
  • Sensitive data can be masked
  • Static log files can be collected from non-default log directories

The Datadog documentation describes more processing rule types, as well as more patterns for known integrations. Log collection info usually appears at the end of the specific integration's article.

As an example, take log pre-processing for a Node.js app that runs on Kubernetes. With the Datadog Kubernetes daemonset, you can pass log processing rules to the agent through the app's "annotations".


This means that the log processing config for a specific app is defined on the app itself, and not on the Datadog agent (which is also possible by default).
This is great when a log parsing rule is expected to change only for a specific app and you don't want to redeploy the Datadog daemonset.
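
A rough sketch of how such an annotation could look when the app is deployed with Terraform. It assumes the Terraform Kubernetes provider is already configured, and every name in it (the "nodejs-app" deployment, the "app" container, the image, and the three regex patterns) is hypothetical. The annotation key follows the Datadog Autodiscovery form "ad.datadoghq.com/<container name>.logs", and the rule types are the ones listed above (multi_line, exclude_at_match, mask_sequences).

```hcl
locals {
  # Log config for the container named "app" (hypothetical values throughout).
  datadog_log_config = [{
    source  = "nodejs"
    service = "nodejs-app"
    log_processing_rules = [
      {
        # Treat lines that do not start with a date as part of the previous message.
        type    = "multi_line"
        name    = "new_log_starts_with_date"
        pattern = "\\d{4}-\\d{2}-\\d{2}"
      },
      {
        # Drop noisy health-check lines before they are sent.
        type    = "exclude_at_match"
        name    = "exclude_healthchecks"
        pattern = "GET /health"
      },
      {
        # Replace anything that looks like an API key with a placeholder.
        type                = "mask_sequences"
        name                = "mask_api_keys"
        replace_placeholder = "[masked_api_key]"
        pattern             = "api_key=\\w+"
      }
    ]
  }]
}

resource "kubernetes_deployment" "nodejs_app" {
  metadata {
    name = "nodejs-app"
  }

  spec {
    selector {
      match_labels = { app = "nodejs-app" }
    }

    template {
      metadata {
        labels = { app = "nodejs-app" }

        # The rules live on the app's pod template, not on the Datadog daemonset.
        annotations = {
          "ad.datadoghq.com/app.logs" = jsonencode(local.datadog_log_config)
        }
      }

      spec {
        container {
          name  = "app" # must match the name used in the annotation key
          image = "example/nodejs-app:1.0.0"
        }
      }
    }
  }
}
```

The same JSON string can be placed under the pod template's annotations in a plain Kubernetes manifest or a helm chart; only the app is redeployed when a rule changes.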

Don't forget that all of this is only pre-processing; the main parsing and all indexing happen in your Datadog log pipelines.

If the "source" name in your logs matches the name of one of the known integrations, you will see dynamic default pipelines for this integration with ready-made pipeline rules.

You can just clone them and customize them.


I hope this helps to implement Datadog logs collection from Kubernetes apps.



April 24, 2020

Environment Variables in Terraform Cloud Remote Run


It is great to be able to use the outputs (or state data) from another Terraform Cloud workspace.

In most cases, that workspace will be in the same TFC organization.
But one of the required arguments of the "terraform_remote_state" data source is "organization"... Hmm, that is the one I am running in right now.
The second required argument is "name" (the remote workspace name).
Hmm, what if you are using a workspace naming convention or workspace prefixes?

Ok, it looks like it can be done easily.
Like any "CICDaaS" with remote runners, TFC sets unique dynamic (system) environment variables on each run, like "runID" and the workspace and organization names.
To be sure which variables exist, just run some TF config with the local-exec command "printenv", and you will see all Key=Value pairs.
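
A minimal sketch of such a throwaway config, assuming the hashicorp/null provider is available in the run; the resource name "print_env" is made up for illustration.

```hcl
resource "null_resource" "print_env" {
  # timestamp() changes on every run, so the provisioner fires on each apply.
  triggers = {
    always_run = timestamp()
  }

  # Dump the run's environment into the apply log, sorted for readability.
  provisioner "local-exec" {
    command = "printenv | sort"
  }
}
```

Since this prints every variable into the run log, including any sensitive ones set on the workspace, it is best used as a one-off check and then removed.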

Note that only system variables with the "TF_VAR_" prefix are accessible to you inside Terraform. This is unrelated to Terraform providers, which read specific predefined environment variables for their own needs (like AWS_DEFAULT_REGION).
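
To illustrate the prefix rule: Terraform maps an environment variable named TF_VAR_<name> onto the input variable <name>. The variable below is hypothetical and stands in for whatever TF_VAR_ names show up in your own printenv output.

```hcl
# Hypothetical variable: it is filled in only if the environment exports
# TF_VAR_example_run_label; otherwise the default is used.
variable "example_run_label" {
  type    = string
  default = "unset"
}

# Echo the value so it is visible in the run output.
output "example_run_label" {
  value = var.example_run_label
}
```

A provider variable such as AWS_DEFAULT_REGION needs no declaration like this, because the provider reads it from the environment on its own.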

Let's get back to our case.

So we have two workspaces in TFC, under the same organization:

  1. A workspace where we bring up the AWS VPC, EKS, and MongoDB
  2. A workspace where we deploy Kubernetes services with helm charts.

To be able to deploy to the created Kubernetes cluster (EKS), the "second" workspace must pass Kubernetes authentication first. Also, the Kubernetes services should get the "MongoDB_URI" string.

That's why we will call the "first" workspace "demo" and the "second" one "demo-helm".
Then, in the "second" workspace, a "terraform_remote_state" data source must be read before the rest of the objects/resources.
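
A minimal sketch of what this could look like in the "demo-helm" workspace. Two things here are assumptions rather than facts from this post: that the run exports TF_VAR_TFC_WORKSPACE_SLUG in the form "<organization>/<workspace>" (confirm the actual name with printenv), and that the "demo" workspace declares an output called "mongodb_uri" (a hypothetical name).

```hcl
# Assumption: filled from TF_VAR_TFC_WORKSPACE_SLUG, e.g. "my-org/demo-helm".
# Check printenv in your own run for the real variable name.
variable "TFC_WORKSPACE_SLUG" {
  type = string
}

locals {
  slug_parts   = split("/", var.TFC_WORKSPACE_SLUG)
  organization = local.slug_parts[0]

  # Naming convention: "demo-helm" reads the state of "demo".
  source_workspace = trimsuffix(local.slug_parts[1], "-helm")
}

data "terraform_remote_state" "demo" {
  backend = "remote"

  config = {
    organization = local.organization
    workspaces = {
      name = local.source_workspace
    }
  }
}

# Hypothetical output name; the "demo" workspace has to declare it.
output "mongodb_uri" {
  value     = data.terraform_remote_state.demo.outputs.mongodb_uri
  sensitive = true
}
```

Because both the organization and the source workspace name are derived from the current run, a "staging-helm" workspace would read from "staging" without any per-workspace variables.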



I hope this helps a little to avoid defining these things in variables and, by design, to prevent helm deployments to the wrong cluster.