I did/do run both myself, Kubernetes and Nomad, and it was a million times easie...

throwaway894345 · on Feb 15, 2021

I would be very interested in a more detailed write-up on Nomad vs Kubernetes for bare metal. I'm working through getting Kubernetes stood up, but I'm running into a dearth of features--namely you have to bring your own load balancer provider, storage provider, ingress controller, external DNS, monitoring, secret encryption, etc, etc before you can run any real world applications on top of it. I would be interested in how Nomad compares.

EDIT: Downvoters, I'm really curious what you're objecting to above, specifically.

jordanbeiber · on Feb 15, 2021

It’s much easier with nomad as you are not forced into a “black box” with networking layers and surrounding requirements.

Bare metal nomad: use it with consul and hook up traefik with the consul backend. This would be the simplest, most “zero conf” way to go.

I’ve used this setup for a few years of heavy production use (e-commerce & 50 devs).

As consul presents SRV records you can hook up a LB using those, or use nomad/consul templating to configure one.
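To make the SRV idea concrete, here is a minimal sketch of turning SRV answers into a load balancer config. It assumes the records have already been resolved (for example from a lookup of a name like web.service.consul); the `SrvRecord` type, the `render_upstream` helper, the hostnames, and the ports are all made up for illustration, and the nginx-style output is just one possible target format.

```python
from typing import List, NamedTuple


class SrvRecord(NamedTuple):
    """One DNS SRV answer: priority, weight, port, target host."""
    priority: int
    weight: int
    port: int
    target: str


def render_upstream(name: str, records: List[SrvRecord]) -> str:
    """Render an nginx-style upstream block from SRV answers.

    Only the most preferred group is used (lower priority value wins).
    """
    if not records:
        raise ValueError("no SRV records for " + name)
    best = min(r.priority for r in records)
    preferred = sorted(
        (r for r in records if r.priority == best),
        key=lambda r: (r.target, r.port),
    )
    lines = [f"upstream {name} {{"]
    for r in preferred:
        # SRV allows weight 0; clamp to 1 so every backend stays usable
        weight = max(r.weight, 1)
        host = r.target.rstrip(".")  # drop the trailing dot of the FQDN
        lines.append(f"    server {host}:{r.port} weight={weight};")
    lines.append("}")
    return "\n".join(lines)


# Hypothetical answers, shaped like an SRV lookup result
records = [
    SrvRecord(1, 1, 21034, "node1.node.dc1.consul."),
    SrvRecord(1, 1, 28511, "node2.node.dc1.consul."),
]
config = render_upstream("web", records)
print(config)
```

In practice the rendered text would be written to the LB's config and followed by a reload; templating from nomad/consul does roughly the same job without the custom code.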

Service mesh with mTLS is actually rather approachable and we’ve deployed it on selected services where we need to track access and have stricter security. (This however had us move off traefik and onto nginx + openresty.)

Now if you want secrets management on steroids you’ll want vault. It’s really in many ways at the heart of things. It raises complexity, but the things that you can do with the nomad/consul/vault stack are fantastic.

Currently we use vault for ssl pki, secrets management for services & ci/cd, and ssh cert pki.

These things really form a coherent whole and each component is useful on its own.

Compared to k8s it’s a much more versatile stack although not as much of a “framework” and more like individual “libs”.

I always come back to the description: “more in line with the unix philosophy”.

In a mixed environment where you have some legacy and/or servers to manage I think using the hashicorp stack is a no brainer - consul and vault are tools I wouldn’t want to be without.

sofixa · on Feb 15, 2021

I'm currently working on an article exploring Nomad and how it compares in some aspects with Kubernetes, which I'll post on HN soon-ish.

ForHackernews · on Feb 15, 2021

I'd be really interested, too. Have you looked at k3s at all? We're considering trying to run https://github.com/rancher/k3os on bare metal servers.

throwaway894345 · on Feb 15, 2021

Yeah, that's what I used. It comes with some providers out of the box, but they strike me as toys. For example, it gives you support for node-local volumes, but I don't really want to have to rely on my pods being scheduled on specific nodes (the nodes with the data). Even if you're okay with this, you still have to solve for data redundancy and/or backup yourself. The Rancher folks have a solution for this in the form of Longhorn, so maybe we can expect that to be integrated into k3s in the future. There's also no external DNS support at all, and IIRC the default load-balancer provider (Klipper LB, which itself seems to be not very well documented) assigns node IPs to services (at random, as far as I can tell). That makes it difficult to bind a DNS name to a service without something dynamically updating the records whenever k8s changes the service's external IP address (and even then, this is not a recipe for "high availability" since the DNS caches will be out of date for some period of time). Basically k8s is still immature for bare metal; distributions will catch up in time, but for now a lot of the hype outpaces reality.
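For a sense of what "something dynamically updating the records" involves, here is a rough sketch of the diffing step only. It assumes you can already read the current DNS records and the service's current external IPs from somewhere; `plan_dns_changes`, the hostname, and the addresses are hypothetical, and actually applying the changes (plus the cache staleness problem mentioned above) is left out.

```python
def plan_dns_changes(current, desired):
    """Return (to_add, to_remove) as sorted lists of (hostname, ip) pairs.

    Both arguments map a hostname to the set of A-record IPs for it.
    """
    cur = {(host, ip) for host, ips in current.items() for ip in ips}
    des = {(host, ip) for host, ips in desired.items() for ip in ips}
    return sorted(des - cur), sorted(cur - des)


# Hypothetical state: the service moved from one node IP to another
current = {"app.example.com": {"192.0.2.10"}}
desired = {"app.example.com": {"192.0.2.11"}}

to_add, to_remove = plan_dns_changes(current, desired)
print("add:", to_add)
print("remove:", to_remove)
```

Adding the new records before removing the old ones narrows the window where the name resolves to nothing, though clients holding cached answers will still lag behind.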

dilyevsky · on Feb 15, 2021

There’s metallb that lets you announce bgp to upstream routers. Another solution would be to just announce it via a daemonset on every node and set up a nodeport. Or just add every frontend node IP into DNS. Obv all highly non-standard as it depends on your specific setup.

throwaway894345 · on Feb 15, 2021

Yes, to be clear, these problems can be worked around (although many such workarounds have their own tradeoffs that must be considered in the context of the rest of your stack as well as your application requirements); I was observing that the defaults are not what I would consider to be production-ready.

dilyevsky · on Feb 15, 2021

I don’t think kubernetes ever promised to be a turnkey system, at least outside of the cloud. There are many commercial vendors willing to fill that gap, though.

throwaway894345 · on Feb 16, 2021

> I don’t think kubernetes ever promised to be a turnkey system

No one is arguing that they did.

fomine3 · on Feb 16, 2021

Obviously original authors want users to use their cloud.

ForHackernews · on Feb 15, 2021

I wonder how their Rio thing stacks up? https://rancher.com/blog/2019/introducing-rio

chrischen · on Feb 16, 2021

Any thoughts on RKE which runs a full k8s distro? Was able to deploy bare metal with just a single cluster config and “rke up”.

BurritoAlPastor · on Feb 15, 2021

You have to bring all of those same things to a Nomad deployment as well. It’s generally more lightweight than Kubernetes, so it might be easier to wire those other components in, but you do still need to do that work either way.

64mb · on Feb 15, 2021

> it might be easier to wire those other components in

IMO the few lines of yaml that set the path/host in an Ingress definition seem cleaner than using consul-template to spit out some LB config (as in the post's example).

For simplicity, a few years ago I preferred Traefik + Swarm. Add a label or two and you're done. But Swarm died :/

chucky_z · on Feb 15, 2021

Let me try to do some quick mapping...

> load balancer provider

consul connect handles this, how you get traffic to the ingresses is still DIY... kinda. you can also use consul catalog + traefik (I've actually put in some PRs myself to make traefik work with a really huge consul catalog so you can scale it to fronting thousands of services at once). there's also fabio. you can also get bgp ip injection with consul via https://github.com/mayuresh82/gocast run as a system job to get traffic to any LB (or any workload) if that's an option.

i've also run haproxy and openresty without any problems getting stuff from consul catalog via nomad's template stanza and just signaling them on catalog changes.
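The render-then-signal pattern described above can be sketched in a few lines. This is not how nomad's template stanza is implemented, just an illustration of the idea: re-render on each catalog snapshot and only signal the proxy when the rendered output actually changed. The `Reloader` class, the haproxy-style server lines, and the addresses are assumptions for the example.

```python
import hashlib


def render_backends(services):
    """Render haproxy-style server lines from (name, address, port) tuples."""
    lines = [
        f"    server {name} {addr}:{port} check"
        for name, addr, port in sorted(services)
    ]
    return "\n".join(lines) + "\n"


class Reloader:
    """Re-render on every catalog snapshot, signal only when output changes."""

    def __init__(self, send_signal):
        # send_signal is any callable; in real use it might send SIGHUP
        self._send_signal = send_signal
        self._last_digest = None

    def update(self, services):
        rendered = render_backends(services)
        digest = hashlib.sha256(rendered.encode()).hexdigest()
        if digest == self._last_digest:
            return False  # identical config, leave the proxy alone
        self._last_digest = digest
        self._send_signal()
        return True


# Record signals in a list instead of signaling a real process
signals = []
reloader = Reloader(lambda: signals.append("SIGHUP"))

reloader.update([("web-1", "10.0.0.5", 8080)])  # first render
reloader.update([("web-1", "10.0.0.5", 8080)])  # unchanged snapshot
reloader.update([("web-1", "10.0.0.5", 8080), ("web-2", "10.0.0.6", 8080)])
print(len(signals))
```

Comparing rendered output rather than raw catalog data means churn that doesn't affect the config (reordered entries, for instance) never triggers a reload.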

> storage provider

anything CSI that doesn't have a 100% reliance on k8s works. if you're also just running docker underneath you can use anything compatible with docker volumes, like Portworx.

> ingress controller

consul connect ingress! or traefik, both kinda serve double duty here.

> external DNS

no good story here -- with one exception, if by "external" you mean "in the same DC but not the same host," consul provides a full DNS interface that we get a lot of mileage out of.

if you're managing everything with terraform there's no reason you can't tie tf applies to route53/ns1/dyn or anything else though!

> monitoring

open up consul/nomad's prometheus settings and schedule vmagent on each node as a system job to scrape and dump somewhere. :)

we also use/have used/will use telegraf in some situations -- victoriametrics outright accepts influx protocol so you can do telegraf/vector => victoriametrics if you want to do that instead.
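As a rough illustration of the scrape setup, here is a sketch that builds per-node scrape URLs for a scraper to consume. It assumes the default HTTP ports (4646 for nomad, 8500 for consul) and that prometheus-format metrics have been enabled in both agents' telemetry settings; `scrape_targets` and the node IPs are hypothetical, and the output would still need translating into whatever config your scraper reads.

```python
# Metrics endpoints on the nomad and consul HTTP APIs
NOMAD_METRICS_PATH = "/v1/metrics"
CONSUL_METRICS_PATH = "/v1/agent/metrics"


def scrape_targets(node_ips, nomad_port=4646, consul_port=8500):
    """Return one nomad and one consul scrape target per node IP."""
    targets = []
    for ip in node_ips:
        targets.append({
            "job": "nomad",
            "url": f"http://{ip}:{nomad_port}{NOMAD_METRICS_PATH}?format=prometheus",
        })
        targets.append({
            "job": "consul",
            "url": f"http://{ip}:{consul_port}{CONSUL_METRICS_PATH}?format=prometheus",
        })
    return targets


# Hypothetical node addresses
targets = scrape_targets(["10.0.0.1", "10.0.0.2"])
for target in targets:
    print(target["job"], target["url"])
```

With a scraper running as a system job on every node, each instance only needs its own node's pair of URLs, so the list per agent stays at two entries.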

> secret encryption

this is all vault. don't be afraid of vault! vault is probably hashicorp's best product and it seems heavy but it's really not.

there's a lot here that doesn't really compare at all, like the exec/raw_exec drivers. we use those today to run some exotic workloads that do really poorly in containers or that have special networking needs that can map into containers but require a lot of extra operational effort, e.g.: glb-director and haproxy running GUE tunnels.

something interesting about the above: i'm testing putting those workloads in the same network namespace, so containerized and non-containerized workloads can share local networking across different task runners.

jokethrowaway · on Feb 15, 2021

I had the opposite experience (in 2 different companies). Setting up K8s was quite straightforward and docs were helpful. We ended up building a deployment UI for it though.

Consul is nice and easy to use.

Nomad has been a painful experience: the default UI is confusing (people accidentally killed live containers), we have some small bits and pieces that don't quite behave as we expect and have no idea how to fix them. Error rate is too low to care and there are more pressing issues so likely WONTFIX. We often found ourselves looking into github issues for edge cases or over-allocating resources to overcome scheduling problems.

We considered just switching to their paid offering, just not to have to worry about this.

It kind of feels like that's their business model: attract engineers with OSS software and then upsell the paid version without all the warts.

tutfbhuf · on Feb 15, 2021

Yup. Setting up k8s with kubeadm on bare-metal is very straightforward and can be done within a few minutes on any linux host that is supported by kubeadm and has docker + ssh access.

mahesh_rm · on Feb 15, 2021

>> I plan to write an article/kind-of-tutorial about the setup.

Please post it here when you do. :)

mad_vill · on Feb 15, 2021

bear metal sounds so sick.

I'm in for an industrywide rename :D.

qchris · on Feb 15, 2021

There's probably a startup name in there somewhere.

Bear Metal Semiconductor. Bear Metal Fabrication. Bear Metal Labs. You could get so creative with the branding and logo.

fomine3 · on Feb 16, 2021

It's sold as Thermal Grizzly Conductonaut

marvinblum · on Feb 15, 2021

Woops :D

GordonS · on Feb 15, 2021

Are you using Nomad to schedule containers on those nodes? I'd also be really interested in a blog post or write-up about your setup!

marvinblum · on Feb 15, 2021

Sure, that's what it is for :) We also run cron jobs, Postgres, and Traefik.

rugwirobaker · on Feb 21, 2021

Nice product by the way. I might try it later on one of my upcoming projects.