
Terraform is such an underappreciated tool. It seems like so much of the hate surrounds HCL1 (back in Terraform before 0.12) and doesn't reflect modern Terraform.

For example, after introducing `for_each` and dynamic blocks, it's possible to nearly entirely ditch variables files and local modules, and just add more infrastructure by editing a local YAML file. The only variables your Terraform code should have should be credentials / other secrets that are not loaded from environment variables by providers. A great public example of this usage pattern is supplied by https://github.com/concourse/governance to manage their GitHub repositories.
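As a rough sketch of that pattern (the YAML layout, file name, and `github_repository` usage here are hypothetical illustrations, not taken from the Concourse repo):

```hcl
# repos.yaml (hypothetical):
#   widgets:
#     description: "Widget service"
#   gadgets:
#     description: "Gadget service"

locals {
  repos = yamldecode(file("${path.module}/repos.yaml"))
}

resource "github_repository" "all" {
  for_each    = local.repos
  name        = each.key
  description = each.value.description
}
```

Adding a repository is then just another YAML entry; no new HCL needs to be written.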



My problem with this approach is that it's still too much "infrastructure as data" and not "infrastructure as code." Moving infrastructure data into flat files is not a clear-cut win over having it in a database: you get easier version control with external tools like git, but you lose everything that makes a database a joy to work with instead of flat files, like schema validation and easy queries.

Things like for_each and variables exist because "infrastructure as data" would be incredibly tedious and brittle and hard to extend, but an approach that tries to get to "infrastructure as code" by starting with a data format instead of a programming language just seems like too big a gap to cross. I haven't seen a lot of teams unit testing their terraform, for instance.


But at the end of the day your infrastructure is essentially data, not code. Your infrastructure is permanent: it exists even when it isn't being used, and it has inertia. Ultimately, your "infrastructure" is really just an entry in a cloud provider's database; it is data, not code.

I think we are seeing things come full circle again where people are finding the limitations of declarative infrastructure tools and decreeing declarative infrastructure dead and moving back to imperative infrastructure tools like Salt or Ansible.

Does anyone else feel that the infrastructure tooling environment/space is in the same place the JS world was 5 years ago?


> But at the end of the day your infrastructure is essentially data, not code. Your infrastructure is permanent: it exists even when it isn't being used, and it has inertia. Ultimately, your "infrastructure" is really just an entry in a cloud provider's database; it is data, not code.

That may well be true, but it doesn't solve the problem (note also that HTML is just data, but we don't typically expect people to copy/paste the same HTML blob for every blog entry they write nor do we expect them to update each of them when they need to make a change):

We often have N very similar, large, complex YAML/HCL/etc objects that we want to manage with Terraform. If we need to make a change to all of them, we have to update N different places. Keeping these in sync is tedious and error prone. So we need to be able to factor out the common code into some reusable unit that accepts the bits that vary as parameters. Terraform's notion of "modules" is a great big acknowledgement of this need, although it's amazing that the whole time they were building this no one thought to themselves "guys, this seems really heavyweight and cumbersome for what ultimately is just a function" (and that general failure to notice that they were accidentally building a fully fledged programming language seems like an apt summary of Terraform's development).

Note also that there's nothing special about infrastructure as code here, this is a general application of the DRY principle.
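To make the "module is just a function" point concrete, a minimal sketch (all names hypothetical): the module's variables are the parameters, and each `module` block is a call site.

```hcl
# modules/service/variables.tf -- the "function parameters"
variable "name" {
  type = string
}

variable "instance_count" {
  type    = number
  default = 2
}

# Call sites -- each invocation supplies only what varies
module "api" {
  source         = "./modules/service"
  name           = "api"
  instance_count = 4
}

module "worker" {
  source = "./modules/service"
  name   = "worker"
}
```

A shared change lands once, in the module body, instead of in N near-identical copies.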

> I think we are seeing things come full circle again where people are finding the limitations of declarative infrastructure tools and decreeing declarative infrastructure dead and moving back to imperative infrastructure tools like Salt or Ansible.

Just because you're using a programming language doesn't mean you're imperatively updating state. You use a programming language to generate the static configuration (e.g., the YAML) that verbosely describes the desired state of the world that the application engine can then diff against the current state to figure out what changes need to be made. This is sort of what Terraform is doing these days, but by all appearances they didn't realize what they were doing and consequently the programming language they built was predictably awful.


Excellent summary of the problem.


It sounds like you're suggesting that there's some inherent reason why your infrastructure definition must have the same structure as the output of that definition (the infrastructure). I agree that the infrastructure is state, but it seems obvious to me that sometimes it requires non-trivial computations to decide on the desired state, something which is best served by code.

It's also a false dichotomy IMO that configuration files are the only declarative alternative to imperative tools like Salt/Ansible. You can have declarative code too: my laptop is running NixOS and its system state is defined in code (in a purpose-built language that looks much like a config file).

So really I think there are three approaches, not two, each with upsides and downsides which keep us all ping-ponging between them:

1. Config files are ideal for simple use cases, but a mess for complex ones.

2. General-purpose programming languages are completely flexible, but allow you to create a huge unmaintainable mess.

3. Dedicated declarative languages constrain you enough to mostly provide the best of both config files and code, but then you have to learn a whole new language, one which was probably conceived hastily (I find the Nix language awful, honestly).

Some people need arbitrary computations to define their infrastructure, so I think pure config files are a non-starter from a purist's perspective. But, so far we haven't been able to come up with a programming language for infrastructure that isn't a mess to use.


Side tangent, but I'm curious as to why you list Ansible as imperative, when it seems to be declarative in how you configure a module?

Or is this a case of scope? (At the level of a single Ansible module, its config is declarative, but playbooks/roles are imperative? Is it the variable substitution/loop mechanics that make it imperative?)


In Ansible, you declare a set of actions that are then performed one by one, occasionally being skipped if a certain condition holds true.

So, you are basically saying "do this, do that, then do that".

In a declarative model, you would say "this is how the end result should look" and the tool would then go off and make that happen, in whatever order its scheduling tools would say.

Sort of the difference between Rust on one side and Prolog on the other (yes, it is possible to get a specific flow of instructions in Prolog, but it is much easier to let the Prolog interpreter/compiler Just Make It Happen Somehow).

FWIW, Puppet gets closer to a declarative model, but unfortunately, the last version I played around with seriously was actually quite bad at inferring ordering on its own, so a LOT of work ended up going into "well, A has to happen before B, so let us string a dependency here".


I guess I need an example of the declarative model, as I can see the Ansible model in my head and it still looks declarative to me, at least at the singular module level.

In Ansible, you say "make sure these packages are installed" and they'll be installed as needed, to match that state, or ignored if already there.

Even the file level stuff you can say "make sure this line is in the file" and it either adds it or says "nope, that's already in there".

Is it that there's modules that aren't declarative? Sort of the esoteric ones to poke specific cloud infrastructure (though even the few of those I looked at seemed to be declarative if needed).


It is not declarative. The simple fact that the playbook will ALWAYS be run in the order you specify, even if a later step is (technically) a prerequisite of an earlier step, means that you are in an imperative mode.

Puppet is declarative, you simply say "these things must, or must not, hold" and a combination of user-declared and inferred dependencies arrange the sequencing, which can be different in each run (as long as the before/after dependencies hold).


Ah so it's that the playbooks are imperative and "dumb" (does exactly what you tell it, rather than inferring "these actions must happen, do them in a sensible order"). That makes sense.

Dove into the puppet docs/wiki article, I guess part of the difference as well is that puppet considers each "unit" a resource, vs. ansible being a "module/action".

It does seem like ansible roles have a dependency mechanism, I guess that might be the intended level for a "declarative" approach in ansible, to encapsulate the playbooks/modules underneath that are more of an implementation detail at that point.


It's a lot better than it used to be, but there are still quite a few annoyances. For example, you still need to use count as a hack for the absence of any kind of "if". You can't define custom functions. Modules can be kind of awkward to work with. And there are still some places that can't take any dynamic values, such as lifecycle.ignore_changes and arguments to providers and backends.
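The conditional-creation idiom looks like this (resource and variable names are hypothetical; `count` with a ternary stands in for the missing "if"):

```hcl
variable "enable_logging" {
  type    = bool
  default = true
}

# Conditionally create a resource: count = 1 or 0 is the "if" hack
resource "aws_cloudwatch_log_group" "this" {
  count = var.enable_logging ? 1 : 0
  name  = "/app/example"
}

# Downstream references must then cope with the zero-or-one instance;
# one() yields the single element, or null when the list is empty
output "log_group_arn" {
  value = one(aws_cloudwatch_log_group.this[*].arn)
}
```

Every consumer of the resource has to know it might not exist, which is a big part of why the idiom feels like a hack.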


The `count` "hack" is so common that it barely qualifies as a hack anymore. It's just common practice and immediately understandable when you read the code.


This reminds me of Shopify's Liquid DSL: a horror to work with, but you can just about make it do what you want. Sometimes it feels like writing assembly to do string manipulation if they haven't built a function for your exact scenario.


I really prefer the Pulumi approach where you define the configuration in your favorite Turing-complete language.

Not sure why HashiCorp felt the need to reinvent the wheel instead of having a library in an existing language generate markup or JSON or something like that.


The biggest issue with Pulumi is that Pulumi doesn't support adding custom API providers. Part of the power of Terraform is in provisioning infrastructure, orchestration, deployment, and application configuration all in one tool. For example:

(aforementioned GitHub provider)

https://registry.terraform.io/providers/terraform-provider-c... for Concourse (CI/CD)

https://registry.terraform.io/providers/coralogix/coralogix/... (full disclosure: I work for Coralogix)

This would be completely impossible with Pulumi. If Pulumi didn't bless it, it doesn't exist in Pulumi's world. Terraform, meanwhile, lets you isolate all the network calls in a custom provider and just focus on the configuration. The number of paid external APIs is only expanding; Pulumi can't possibly build and support them all in-house. This seems like a real limitation of Pulumi's "use any programming language you want" design, and something that needs to be addressed. It's not that writing a custom Terraform provider is easy, but it is quite simple to get started by following any of the bajillion open-source providers as a sample template.


(Pulumi providers dev here)

This has been the case in the past but we are investing in our provider ecosystem. We built several first-party native providers that aren't based on TF: Kubernetes, Azure, Google. Now, we also encourage third-parties to build their integrations.

Here is a boilerplate repo of a resource-based provider: https://github.com/mikhailshilkov/pulumi-provider-boilerplat...

Here is a provider that is driven by an Open API spec: https://github.com/mikhailshilkov/pulumi-provider-boilerplat...

For simple use-cases, you've always been able to build Dynamic Providers in TypeScript or Python: https://www.pulumi.com/blog/dynamic-providers/

Please reach out if you want to build a provider and we'll definitely help you out.


> If Pulumi didn't bless it, it doesn't exist in Pulumi's world.

That has not been my experience. I have personally ported a Sentry TF provider into Pulumi, and I will grant you that their docs and examples for exercising the process border on active user hatred, but it does work:

https://github.com/pulumi/pulumi-terraform-bridge#adapting-a...

https://github.com/pulumi/pulumi-tf-provider-boilerplate#rea...

What mystifies me about that situation is that I do actually appreciate the amount of silliness required to avoid using Pulumi Cloud: they are not financially incentivized to make that easy, but I'd guess a lot more folks would nope right out if they didn't make it possible.

However, I would think they'd want to make ingesting a TF provider into Pulumi as smooth and reliable as possible, so that people don't close their browser tab when they can't find a supported Pulumi provider that does exist for TF.


> This would be completely impossible with Pulumi. If Pulumi didn't bless it, it doesn't exist in Pulumi's world.

This is only true (temporarily) for automatic plug-in installation, and until recently it was also true of Terraform. In fact, I recently had to reverse engineer the TF provider registry protocol because the documentation is manifestly incorrect.

$WORK has lots of Pulumi plug-ins whose existence Pulumi knows nothing of, and they work fine.


Maybe I’m missing something, but I don’t think this is true? E.g., https://www.pulumi.com/blog/dynamic-providers/ There’s also an example on their blog of doing a schema migration with custom logic.


Why are you using Terraform for orchestration?


Can't agree enough.

Declarative programming makes sense for lots of things, React is a great example.

With such a big dependency graph for infra, adding loops and variables and templating to achieve the same thing as Pulumi in a "declarative" way is ultimately just harder and worse than using a familiar, powerful language with an SDK.


Worth noting that Pulumi IS declarative - the languages build a graph imperatively, but the evaluation is declarative in nature.


For me it's less about HCL annoyance nowadays and more about discoverability. Using Pulumi, I no longer have to memorize resource properties because I get IDE autocompletion.


Autocomplete is automatic in IntelliJ as far as I can see. I don't recall doing any kind of custom configuration to get it working. Autocomplete works on resource names, variable names, properties, etc.


Autocomplete for Terraform/HCL is available, too, though you do have to use specific tooling (e.g., VS Code with the Terraform extension) rather than the same tools you use to work on JS.


The specific tool recommended here is simply not very good; despite the language server efforts, the IntelliJ HCL plugin is worlds apart from the VS Code tooling (and has been for years). Unfortunately it's not open source; if it were, it would mean the availability of an open-source, production-quality HCL2 parser for the JVM ecosystem, which would be very useful.


I have really liked the Terraform support in IntelliJ, but the "HashiCorp Terraform / HCL language support" plugin seems to have had its most recent release on July 17, 2020[1]. And it clearly does not support a bunch of the newer constructs and properties, which is very unfortunate.

[1] https://plugins.jetbrains.com/plugin/7808-hashicorp-terrafor...


Any examples of things that aren't supported? It doesn't need to embed the metadata per-resource anymore.


I'm seeing errors on each.value.foo when using for_each. Also, this gives me errors:

locals {
  foo = {
    for bar in local.bars : "${bar.x}.${bar.y}" => bar
  }
}

Then, optional(bool) gives "is not a valid type constructor".

Those all seem to be "language" aspects. For a resource like "github_branch_protection", it seems not to recognize the right properties. That seems to be more of a provider issue.
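For reference, `optional()` appears inside variable type constraints; a minimal sketch (hypothetical variable, valid in Terraform 1.3+, where an unset optional attribute defaults to null) of the sort of construct the plugin flags:

```hcl
variable "branch_protection" {
  type = object({
    pattern        = string
    enforce_admins = optional(bool)
  })
}
```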


it took 5 years to get that useful for_each for modules though

so I'd imagine some people waited long enough that they moved on to better tools.


What better tools do you have in mind? Most of the people I know in the space have been moving _to_ Terraform, although CDK has improved enough over CF to be appealing for people who are all in on Amazon.


"better" is subjective. But Pulumi fixes a lot of the pain points of terraform for me.


Pulumi has been such a breath of fresh air. It is the only tool that actually feels like it encompasses "infrastructure as code".


Terraform CDK is a thing too, if you want to go beyond AWS.


for_each is an anti-pattern for reliable Terraform IMO. Not sure it was worth the wait, and there isn't much out there that can compare with the simplicity of Terraform.


WAY more reliable than count, which would do screwy things like renaming a bunch of resources and deleting the last item if you removed an item from the middle of a list.

Complex architectures and reusable module encapsulation require a bit more complexity than HCL1 was capable of describing, IMO (and apparently in the opinion of most of the Internet). That doesn't necessarily make it less reliable.

Could I describe my infrastructure "reliably" just using raw resources with no loops? Sure but that sounds like a nightmare to both build and maintain.
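A sketch of the difference (bucket names hypothetical): with `count`, instances are addressed by position, so removing an element from the middle shifts every later index; with `for_each`, instances are addressed by key, so only the removed one is destroyed.

```hcl
variable "buckets" {
  type    = list(string)
  default = ["logs", "assets", "backups"]
}

# count: removing "assets" shifts the indices, so Terraform wants to
# turn by_count[1] into "backups" and destroy by_count[2]
resource "aws_s3_bucket" "by_count" {
  count  = length(var.buckets)
  bucket = var.buckets[count.index]
}

# for_each: each instance is keyed by its name, so removing "assets"
# destroys only aws_s3_bucket.by_name["assets"]
resource "aws_s3_bucket" "by_name" {
  for_each = toset(var.buckets)
  bucket   = each.key
}
```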


'Fraid you lost me at "YAML". Anything built on YAML seems like it has to indicate bad judgment at the root.

There could be reasons, but I don't know them.


It's just an awful language.

Using it is like writing MS-DOS batch files, where you are constantly working around limitations and bizarre syntax.


> Terraform is such an underappreciated tool

Are you kidding? It's the go-to tool even for people who are brand new to IaC.

If anything, I would say CloudFormation is underappreciated; a lot of the reasons why TF was created were fixed almost a decade ago. TF users are still citing those things as the reason why they use TF, without ever having used CF.


I have not looked closely at CF for a couple of years, but in late 2016, I actively preferred TF over CF. However, I understand that the JSON-only format has since been changed, and since that was the only real issue I had with CF...


I haven't done a lot of infrastructure work in the past few years so haven't stayed super on top of the latest changes. I last used it heavily in the earlier days, roughly 4-7 years ago now. And while a lot of the community was great, put in a lot of work on the product, and generally wanted to improve the tool, there were also a lot of very vocal stodgy old timers that were really resistant to any improvements from the very earliest days. It definitely rubbed me the wrong way at times and made me want to look at alternatives.

I remember some old threads about loops for instance, and a lot of the core community was fully convinced that it was a terrible idea, nobody should ever need loops, and if you're a complete weirdo who does want them you should just use a separate templating language to generate your terraform configs instead. And when modules were first released, the support for using them as a means of local code encapsulation and reuse was pretty weak (it would for some reason hard-code absolute file paths in the tfstate file IIRC, so if one person ran a terraform plan on a state file somebody else had last pushed it would always show up as needing to be changed even if it was already up to date). Again I remember core developers insisting that nobody needs features for local code reuse, and modules are only needed for publishing public resources that others can pull in.

Anyway, by no means do I hate Terraform, but I definitely associate it with being unnecessarily clunky and convoluted and full of gotchas even for fairly common use cases. In my opinion that reputation is pretty deserved and built up over probably a hundred hours of experience struggling with it a few years ago. I'm glad to hear that it sounds like that is changing, but I'd still be very cautious and carefully evaluate all the newer alternatives before rushing back to use it again.


Things have changed since you last used it 4 years ago, so it's probably unfair to judge the tool now based on how it operated then. Most of these pain points (code reuse, state management, more robust HCL features) have been addressed. The one major thing I'd like to see are better LSP bindings for IDE support.

Terraform has been a great tool and it's always surprising to me to hear people hating on it.


It's a fine tool, but all the other sibling comments highlight the same kinds of issues I mentioned and got completely downvoted for, so clearly there is something to it. Nobody is hating on Terraform; we're just trying to avoid choosing a tool that makes the job more difficult than the alternatives.


> very vocal stodgy old timers that were really resistant to any improvements from the very earliest days

As one of the three maintainers of Terraform (for the core and all providers) in that time frame, your characterisation is not particularly accurate - likely hence the downvotes.

Many of the “suggestions” in that time frame were “we should do something and ‘X’ is something so we should do ‘X’” - which is to a large extent how TF came into being.

From the earliest days, breaking changes were avoided, a policy which was not retained through later versions.

While you may have heard some “core developers” claim that reuse was unnecessary (I can’t claim omnipresence), the HashiCorp official training that I taught during that time period _used modules extensively_ for this.


Agreed. Before Terraform, the alternatives were terrible. Remember CloudFormation? Never again. I'd rather use good patterns around Terraform design than ever go back.



