It Was Never About Ops (biven.org)
71 points by michaelbiven on Nov 29, 2016 | 52 comments


It's fun to watch people learn the systems administration trade, but it's a little disheartening to watch them conclude -- incorrectly -- that proper systems administration is some kind of software engineering job.

Is this why Silicon Valley is allergic to the phrase 'systems administrator'? Ops, Devops, SRE... so many phrases to describe various jobs which all, when executed successfully, converge upon the same set of tasks, and the same approaches to them, that have been labeled 'systems administration' for thirty years or more.

What is the motivation behind this trend?


Being a good sysadmin in 2016 requires a skill in programming. The ability to write apps and services that interact with your infrastructure's APIs is necessary to automate all the toil we've done since the beginning. (See Google's SRE book.)

As I have more control over hiring decisions on my team, an inability to use any programming language effectively (bash, python, perl, /anything/) would disqualify someone from a job in sysadmin or information security.


Being a good sysadmin in 1996 and 2006 also required skills in programming. Sysadmins who don't code have always been limited in their career evolution. The same is true for developers who don't know anything about operations.

The focus on automation, and it being called DevOps, is simply an evolution of the work of sysadmins. The same way devs evolved toward TDD and other modern programming techniques.


> The focus on automation, and it being called DevOps, is simply an evolution of the work of sysadmins.

I believe you have this exactly backwards.

The problem is that we keep silo'ing everything and inventing new buzzwords and half-assed tech. We can split things into smaller pieces quicker than we could ever staff up.

Nothing has changed since the 80s. The only thing that's different now is that instead of programming in one little pond of a language, now everything you do is programmable. That doesn't mean you need 47 specialists. It means you need to get really good at managing and limiting complexity.

We don't want to split off and evolve roles into newer and even more specialized roles. This is the thing that made deployment and updating so whacked to begin with.

</rant>


DevOps is not about automation of Ops work. DevOps is about replacement of Ops with code.


Can you clarify the difference between your first and second statement? Isn't replacing Ops with code just automating Ops work?


The difference is thin, like the one between "automatic", "fully automatic", "intelligent", and "smart" systems, or between "programmer", "developer", and "software engineer".

An Ops team may create an automatic solution, e.g. a script to call with parameters, plus a runbook. DevOps will create a fully automated solution, with no need for a runbook.

By definition, a DevOps is a developer (one who uses Agile techniques such as automated test cases, CI, source control systems, ticket management systems, etc.) who is able to do Ops tasks with code.


You have a different definition than I do. I'll copy-paste Wikipedia since it matches my understanding:

> DevOps is a set of practices that emphasizes the collaboration and communication of both software developers and other information-technology (IT) professionals while automating the process of software delivery and infrastructure changes.

DevOps certainly does not mean that Dev has to do Ops now, too, although it's frequently misunderstood as meaning that, and IMO that usually leads to a lot of tears in the long run.


The term «DevOps» originated from a conference named «DevOps»: http://www.jedi.be/blog/2009/12/22/charting-out-devops-ideas...

Wikipedia quotes a description of the conference, not a description of the DevOps role. However, if you read the original article carefully, you will see that the main idea was to close the gap between the ops and dev teams by bringing dev techniques, such as Agile, to the ops team. So, DevOps IS a developer with Ops knowledge.


> Being a good sysadmin in 2016 requires a skill in programming.

This condition has held true as long as there have been systems to administer. Only the names have changed; that's what I'm curious about.


> Is this why Silicon Valley is allergic to the phrase 'systems administrator'

There is no need for a 'systems administrator' in Silicon Valley.

The systems administrators are the guys who buy the desktop computers, set up Windows, and kinda manage the Active Directory.

DevOps/SRE names exist because the tech scene had to stop the confusion between the hardcore Linux admins who can code and the dude who can set up your desktop computer. The first is needed in dev companies mostly located in the tech hubs, the second is needed in most companies all around the world. The two have very little in common.


> DevOps/SRE names exist because the tech scene had to stop the confusion between the hardcore Linux admins who can code and the dude who can set up your desktop computer.

If that confusion existed in any otherwise-competent organization, it was only in the valley. I was a systems administrator before any of "SRE," "DevOps," or "linux" came to be, and I don't think anyone has ever confused me for tech support.


Most of the working Windows system administrators I have met could not code their way out of a wet paper bag (but a majority could code their way into one). If you've been unlucky enough to work on a "managed" workstation, their near-total unawareness of optimization, state, and exception handling is why your Windows login script takes 10 minutes to crunch away.


To be fair, Windows scripting is pretty limited. Then you have PowerShell, which is way too bloated for quick and dirty things.


Most of the nasty Windows scripts I've seen were VBScript, used to bludgeon about various WMI providers and to launch unfortunate executables. I don't think the language was much of a hindrance on its own, but it did nothing to repel stupidity.

Several years on, I still don't know what to think about PowerShell. It's basically .NET + pipelines + some discoverability aids, not terrible in concept, but I will probably go on ignoring it for the rest of its life.


The guy that sets up your laptop is an IT tech, not an SA. The SA is the hardcore Linux admin who can code. From my perspective, an SRE is an SE that focuses entirely on back end performance and reliability, in order to support the SAs. DevOps is a term used by management to justify cost-savings by avoiding hiring someone to ensure system reliability, consistency, safety, and integrity.


Maybe in places that started well: I have no doubt that there have always been people out there that were doing system administration and could code.

The places where I have seen DevOps become popular are those that completely separated programming from keeping servers running, and where the people keeping said servers running were incapable of doing even mild automation. Places where rebooting a non-db server requires making a request 3 weeks in advance, and being OK with 4 hours of downtime.

In those environments, the people that are called system administrators are very low productivity, couldn't help you with a performance problem if their lives depended on it, and are easily replaceable with shell scripts. It's in those environments where DevOps and enterprise cloud migrations are popular: The developers might not be experts in scalability or in getting great uptime, but at least they can get something done. Ultimately this gives management an excuse to lay off all the sysadmins that don't know what /proc is, and make technical decisions based on Gartner reports.

Again, that doesn't mean that everyone that has the title of sysadmin in the world can't code or is unproductive: This is not a judgement on the value of a good SA. I am just trying to open a window to a world that many developers get to see, but few good SAs notice, because there's no way in hell a good SA would work in said companies for long.


The people who take care of internal user systems are IT. IT may have sysadmins, but a sysadmin is not necessarily IT.


That's his point I think? That there is a way to differentiate the different skill sets of different admins.


"The systems administrators are the guys who buy the desktop computers, set up Windows, and kinda manage the Active Directory."


Since when was the dude who can set up your desktop a sysadmin? That would be tech support.


A major Fortune 100-500 firm in my area had many of their Win7 machines rolled out over the network by a guy that knows nothing about networking, can't tell cache from RAM unless explicitly marked, and calls me for basic tech support on occasion. That it worked out fine with basic automation in place supports your assertion. ;)


It's not just about that role, it's more about the interface between the people responsible for running the code and the people responsible for writing it, and to what degree each should be able to do the other's job. Just throwing shit over the wall isn't any good either.


This is a common answer which presumes that all businesses are development companies, or that they are development companies which are legally allowed this interplay between the programming team and the production team.

There are places (places with a lot of money on the line) that do not fall into these categories, and yet the solutions are not so different. I'm just intrigued by the level of effort Silicon Valley is willing to exert to avoid recognizing this -- or even admitting that it's possible.


Hmm, OK. Yeah, if you legally have to separate the two roles then I guess you'd do that. There still needs to be some sort of interface, though, that's all I am saying.

For example, in places I've worked there was a "definition of done": in order to have a new service be taken on by ops, it had to meet certain criteria so that the people running all these services had some common basis for diagnosing and running them. You don't want 100 different ways of calling a healthcheck on a service. You want service nodes to be able to handle a hard shutdown without data loss, maybe. Stuff like that.
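To make the "one way of calling a healthcheck" idea concrete, here is a minimal Python sketch. Everything in it (`HealthStatus`, `run_healthchecks`, the two example checks) is a hypothetical name invented for illustration, not something from a real ops toolkit or from the places described above. The only point is that every service exposes a check with the same shape, so whoever runs the services can call them all the same way.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class HealthStatus:
    """Uniform result every service check must return (hypothetical shape)."""
    healthy: bool
    detail: str = ""


# Assumed convention: a check is a zero-argument function returning HealthStatus.
CheckFn = Callable[[], HealthStatus]


def run_healthchecks(checks: Dict[str, CheckFn]) -> Dict[str, HealthStatus]:
    """Call every registered check the same way and collect the results."""
    results: Dict[str, HealthStatus] = {}
    for name, check in checks.items():
        try:
            results[name] = check()
        except Exception as exc:
            # A check that crashes is reported as unhealthy, not propagated.
            results[name] = HealthStatus(False, f"check raised {exc!r}")
    return results


def check_queue() -> HealthStatus:
    # Stand-in for a real probe of a message queue.
    return HealthStatus(True, "queue depth normal")


def check_database() -> HealthStatus:
    # Stand-in for a probe that fails outright.
    raise ConnectionError("connection refused")


if __name__ == "__main__":
    checks = {"queue": check_queue, "database": check_database}
    for name, status in run_healthchecks(checks).items():
        print(name, "OK" if status.healthy else "FAIL", status.detail)
```

A real "definition of done" would cover more than this (shutdown behavior, logging, and so on); the sketch only covers the shared healthcheck contract.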


If you need to do backend and front end ("full stack"), know devops, know cloud primitives, know database fundamentals, know document storage (Elasticsearch), what are you going to actually be good at? Not much. You can't have both depth and breadth of knowledge, unless you're doing inhuman levels of cocaine and never sleeping.


I think you can be good at all those things, just not at 23.


Even if you're good at all those things, can you imagine keeping up in all those areas?


Of course, I was just speaking to why this is happening, not my grand scheme for how this should all work. Personally I would ideally like to just commit code and be done with it. Being on-call sucks.


There is a remarkable difference between "knowing Elasticsearch" and being able to debug it when shit hits the fan in the middle of the night.


A lot of people doing software have some sysadmin experience, and a small company probably doesn't have enough work for a full time sysadmin. And a part time sysadmin won't work because you need on-call.


It's a niche. It's how I earn my living. I work for small companies that don't have enough daily work to keep someone like me around, but they can pay me a retainer to be on call when Something Bad™ happens.

It's actually pretty fun to have an ear in a wide variety of businesses, you wind up learning a lot outside of the technical world.


I suppose the natural question to ask here is if one thinks it is actually possible to be truly "on call" at multiple places simultaneously.

Is the extent of your agreement service within a day or something shorter/longer?


Response time guarantees are hard to fulfill even by the big guys. I remember being in a coloc years ago where a power problem took out a row of racks filled mostly with Sun equipment owned by various customers (my small company included). It also caused some impressive damage to a transformer outside.

Even with a Gold support plan with 2 hour guaranteed on-site response, it took nearly 2 days to get our servers replaced and running, but the bigger guys were back up within a few hours of the outage. Sun simply didn't have the staff or immediate spares available to move any faster -- their response was still impressive -- they brought in an 18 wheeler full of spares the next day.

They ended up giving us some months of free support or something like that to make up for the delayed response.


It depends on how much they are willing to pay. For most, they pay enough to get a phone call within the hour during normal business hours, an email response within the day, and a 24-hour window if something happens on-site that requires my physical presence instead of remoting in.

I do have a couple of clients that pay quite a bit more for a call-and-fix any time guarantee, but I only offer that level of service to businesses where 1: there is at least one person on their regular staff who is technically competent enough I can walk through more advanced issues and they will understand how to follow instructions, and 2: I'm the one that built out their network, so I know the gritty details.

For the times when the shit hits the fan in two places at once, I have friends who do similar work, and there are a couple of guys I can call to cover if needed. Obviously they get most of the money from these instances, but the customer stays happy, and I get larger monthly residuals than I could otherwise take care of, so everybody wins.


Well, except they don't really have sysadmin experience. They set up a server for their development environment once, and because of that, they get turned into the company sysadmin. So they set up the company's production environment, and the lack of experience shows. Multiple single points of failure, inadequate (if any) backups, poor network design, etc.

One favorite situation I ran into: a database server with a RAID card set to do write caching without NVRAM on a server with a single power supply (so two bad things in one). An overloaded power strip blew a circuit breaker and when the server went down, the filesystem their database was running on was unrecoverable. Oh, and of course they weren't doing backups because they had RAID!


One of the first questions I ask techs I'm considering bringing on to help is if RAID is a backup. If I see them physically cringe at the thought, I know they've got potential.


I'm an awful sysadmin, but why would anyone think RAID is a backup? Two minutes of googling should show that it's not.


The number of people who understand the difference between resiliency and duplication, even in the tech world, is smaller than you would think. And far too many people think they are good sysadmins because they installed a file server that one time.

"Thankfully", problems like CryptoLocker have, paradoxically, made my job as a sysadmin easier, as now most companies I work for are familiar with the horror stories of how someone lost years of work/all their baby pictures/etc and did not have a reliable, cold-storage backup to recover from such attacks.
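The resiliency-versus-duplication distinction can be shown with a toy model. The Python sketch below is purely illustrative: `MirroredVolume` is a made-up class that imitates a two-disk mirror in memory, not real RAID behavior in any detail. It only demonstrates the one property that matters here: a mirror survives a dead disk, but it faithfully replicates a destructive write, while an offline copy taken earlier does not change.

```python
import copy


class MirroredVolume:
    """Toy two-disk mirror: every write goes to all surviving disks."""

    def __init__(self):
        self.disks = [{}, {}]

    def write(self, path, data):
        # Good and bad writes alike are replicated to each live disk.
        for disk in self.disks:
            if disk is not None:
                disk[path] = data

    def fail_disk(self, index):
        self.disks[index] = None  # simulate a dead drive

    def read(self, path):
        for disk in self.disks:
            if disk is not None:
                return disk.get(path)
        raise IOError("all disks failed")

    def offline_copy(self):
        """Return an independent copy, standing in for a cold-storage backup."""
        for disk in self.disks:
            if disk is not None:
                return copy.deepcopy(disk)
        raise IOError("all disks failed")


volume = MirroredVolume()
volume.write("report.doc", "v1")
backup = volume.offline_copy()           # taken before anything goes wrong

volume.fail_disk(0)
after_disk_failure = volume.read("report.doc")   # mirror covers hardware loss

volume.write("report.doc", "ENCRYPTED")  # ransomware-style overwrite
after_overwrite = volume.read("report.doc")      # mirror copied the damage
restored = backup["report.doc"]                  # only the backup has the old data

print(after_disk_failure, after_overwrite, restored)
```

The same reasoning applies to accidental deletes and filesystem corruption: anything written to the volume is written to every mirror member.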


> Is this why Silicon Valley is allergic to the phrase 'systems administrator'? Ops, Devops, SRE...

Titles are a good way for management to increase responsibilities for an employee, without having to pay them more.


That's funny, I would say the exact opposite is more often the case.


[removed]


In my humble opinion, based on my humble work experience, this trend has not originated with overconfident engineers, but rather with cost-cutting managers.


I removed my comment because it painted with an overly large brush.


DevOps is not Ops at all.

DevOps is a developer who writes code to do tasks that are often done manually by the Ops team, e.g. installation, configuration, assigning, recovery from failure, upgrade, migration to a new version of a database, etc.


> Is this why Silicon Valley is allergic to the phrase 'systems administrator'?

'系统管理员'

('xìtǒng guǎnlǐ yuán')

I'm thinking of doing the traditional anglo thing and getting a tattoo of Chinese characters I'm not really familiar with. I'll tell people that it means "fiery heart of the tiger" or "blessed heaven spirit" or somesuch... :)


> In short you can …

> 1. Expect services to grow at a non-linear rate.

Why would you expect this?


My reading was not that you can expect to be a unicorn and grow to Facebook proportions, but that your Growth Model is non-linear. What I mean by that is when your growth occurs, it occurs sharply.


Yeah no, the online banking site for e.g. chase.com isn't going to get non-linear or "sharp" growth. Nor is their internal project time reporting system.


> Scale your people by giving them the time and space to focus on scaling your products.

This rings true.


Couldn't agree more. If you have a monolith, you can probably get away with doing dev ops as long as you don't grow too much. We have a handful of microservices and trying to keep up with the ops work has been hell, though part of that is due to AWS's tools. And this is without even a growing user base, just a growing app base. Burnout is definitely a problem in either case.


This is interesting. I've had a lot of people on the internet tell me that a monolith must become unmaintainable at some point, and the only way to keep it serviceable is to break it up into microservices.



