Once upon a time I worked a contract gig at a credit union in the US. My first day in the datacenter (ground floor), I noticed the grille from a late-'90s/early-2000s Ford F-150 pickup truck hanging on the wall above the door frame. I pointed it out and asked "why is that on the wall?"
The building was like many bank buildings, 4-5 stories, all glass exterior, off a busy main road. In particular, this building was parallel to the busy main road.
Turns out, the busy main road was perpendicular to another road, which lined up almost perfectly with the datacenter on the first floor. One Saturday night, a drunk driver ran the stop sign on the side street, blasted through the field leading up to the datacenter, and drove their truck straight into the building, right through the offices surrounding the main server room. The force of the crash blew bricks across the room in all directions and missed the raised floor by maybe 3 meters.
Surprisingly, nothing went down. The datacenter now has "bulletproof" glass on that portion of the exterior, poles installed into the ground in front of the glass, and an earth berm raised above the road.
The sysadmins found the truck grille inside one of the offices under a desk. They kept it and hung it on the wall as a reminder to plan for any sort of disaster.
At one of my early jobs I was sysadmin for a local office of a small company that wrote simulation software. We had procured some rack-mounted servers that sounded like jet engines, so when we expanded to the office across the street, the boss took the opportunity to knock out two closets and make a server room. I remember having a conversation with him where he asked my advice on the size of A/C to buy. Despite having no knowledge on the subject, I had opinions, and I recommended getting the biggest A/C.
That all got installed, and it was great, we no longer had a jet engine in the room next door, and I felt awesome because not only was I managing 10 Linux machines but now I had a compute cluster. In a rack. With a little slide-out KVM thing that switched between the machines!
Well, several months later I started noticing mold on the side of the closet, and sometimes a thin film of condensation on the rack. So I got a temperature monitor and wrote a script to create pretty graphs, and between the low temperature and the central Texas humidity, we had a problem.
So the boss (who was in the New York office) had me talk to someone who actually knew something about A/C (because I had no idea why this was happening). Apparently I had spec'ed out an A/C large enough to cool the entire building, and the problem I was experiencing was exactly why one matches the A/C to the heat load. The solution was to add a heater to the A/C. So we ended up cooling a building's worth of air, then heating it up, and then sending it to cool 2U's of servers.
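The arithmetic behind "match the A/C to the heat load" is worth spelling out: servers turn essentially every watt they draw into heat, and 1 W is about 3.412 BTU/hr. A quick sketch (the 3 kW figure is a made-up example load, not what we actually had):

    # Rough A/C sizing sketch. The 3 kW load is a hypothetical example.
    WATTS_PER_KW = 1000
    BTU_HR_PER_WATT = 3.412   # 1 W of IT load ~= 3.412 BTU/hr of heat
    BTU_HR_PER_TON = 12000    # 1 "ton" of cooling = 12,000 BTU/hr

    it_load_kw = 3.0          # hypothetical: a short rack of 2U servers
    heat_btu_hr = it_load_kw * WATTS_PER_KW * BTU_HR_PER_WATT
    tons = heat_btu_hr / BTU_HR_PER_TON
    print(f"{heat_btu_hr:,.0f} BTU/hr -> ~{tons:.2f} tons of cooling")
    # ~10,236 BTU/hr, i.e. under one ton. An A/C sized for a whole building
    # will short-cycle against a load this small and never run long enough
    # to dehumidify - hence the condensation and the mold.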
That reminds me of a time I was working at a site delivering a big project.
They had put up a big temporary building next to the project building to hold a lot of additional engineers. The temporary building seemed to be the commercial equivalent of 10 double-wide mobile homes side by side with the side walls removed. There's probably a name for it. The end result was a ramp from the main building, leading up and turning into a long room maybe 200' long and 40' wide, filled with lots of desks and engineers.
There were also columns every so often for network and power drops.
There was also a thermostat every 20 feet or so. (see where this is going?)
I happened to be working late one night and entered this room after everyone was gone. And the funny thing was, even though it was cool outside, all the AC units seemed to be running and making a lot of noise.
Walking down the middle of the room and looking at the thermostats, I realized what had happened. Everybody had their own idea of "comfortable," and the thermostats were all set to different temperatures. The result was that at night you had heating and AC running, sometimes right next to each other, in an epic battle.
I went to each thermostat and set it to the same value, and within a short time equilibrium was reached and the entire room quieted down as the compressors and fans wound down.
This sounds like a sophomore engineering student's weeklong nightmare of a thermodynamics assignment.
"Part f) Your boss realizes you don't have any money left in the budget to replace the A/C unit, but he knows a supplier who can deliver individual heaters for cheap. How many 1500W heaters will be needed to maintain a server room temperature of..."
I worked for a startup in the back of someone's house. They had built an extension on the back of their house which was basically a greenhouse (a steel-and-glass construction). The glass was supposed to be thermally coated, but apparently it was installed backwards, with the thermal coating on the inside, not the outside.
So the story goes, we had our server room and workstations all in this addition. During the summer the air temperature was routinely at or above 100 degrees. I had a small portable AC unit we ran 24/7, and eventually they got a dedicated AC unit for the room. We also ended up covering the glass roof with a tarp to try to block out the sun.
It was a pretty insane setup, but it was a cash-strapped operation and the owners didn't want to do more than what they did. I was a naive kid fresh out of school and thought it was so cool to work at a startup - even if it meant sweating constantly in a noise-filled environment, working nights and weekends.
That said, the startup succeeded and grew into a company.
There was also that one time there was a rain storm and the "greenhouse server room" was not sealed against the side of the house, and the rain poured in through a crack. I will never forget when my colleague called at 1AM to let me know the server room was flooded, and that one server, which had its case open, had the case fill with rainwater - while it was still turned on and running. And it kept running, as most of the water collected at the base of the server chassis...
If you think this sounds crazy, you're right, but I'm sure somebody else has a story just as crazy or even more so!
Ah, the early days, before the cloud... when we did it all ourselves. Yup, I had to wear the HVAC hat too.
A few years ago, I worked for a hospital system which had their datacenter in Phoenix. We had a large (tens of thousands of square feet) datacenter supporting a few dozen hospitals, and its HVAC system was marginal. On a hot day it could barely keep the datacenter below 80 degrees. This was a facility with hundreds of millions of dollars in equipment. I don't think there were any systems which affected patient care, but pretty much every other type of hospital system was in that building.
When the AC did fail, the building temperature spiked up over 90 degrees within half an hour and eventually climbed over 100. We spent most of the afternoon shutting down the least critical systems so the most important ones would stay online, then slowly brought them back up after the HVAC was restored and the building started to cool again in the evening.
I was stunned that they didn't install some kind of redundant HVAC setup after that, or at least one that kept the room at 70 degrees on a hot day.
The other huge facilities failure at that facility was when electricians were working on the (ironically enough) backup power system. They accidentally cut power to the entire datacenter for about 30 seconds.
I've seen more server down-time due to facilities errors than computer hardware by a good margin.
I had an instructor for a class who had a similar story. He had requested a day off of work to take a short vacation, but because it was a small company, it meant that no IT personnel would be available for that day. He came in on Monday and was immediately called into his manager's office because apparently all the servers went down on the day he wasn't there, and they had to call in the other IT guy to fix everything. Turns out, one of the other managers in the office had turned off the dedicated AC to the server room because he thought IT was wasting money by running the AC when nobody was there.
Luckily they sided with my instructor when he said that it wasn't his fault and ended up firing the other manager for trying to cover up his own mistake.
30 years ago I worked at a place where we had a server room full of Sun boxes and a Symbolics Lisp Machine. The server room was a repurposed office to which some A/C had been retrofitted. On one occasion the machines started randomly glitching. The first clue that the A/C had failed was the Symbolics printing 'help I'm getting hot' (I paraphrase) to a console I was monitoring out in the main open-plan office.
I worked as research staff for an OS research group at a university in the 90s. We had a new building, with a nice new machine room. However, the machine room had tightly controlled access which was limited to department IT staff only, and we wanted our grad students to have physical access to our machines. So we took over a "lab" that was never intended to host machines.
By the time I left, we had 2 racks with networking gear (Myrinet and 1Gb/s ethernet, which was a big deal in the 90s), and probably 40+ 1U servers, plus several tables full of larger pedestal storage servers (one of the projects was a cluster filesystem).
Just like in Rachael's story, we would have times where the HVAC would go out. I remember dragging giant 4' fans up from the basement to put in the doors of the room to cool it off during HVAC failures.
My last client had their main nationwide server, that their company depended upon, in a stuffy closet under a fire sprinkler. No it never went off, but dang.
Anyway, folks in enterprise can forget that any organization under a certain size doesn't have any special consideration for its computer systems. They just grew from some PC on a desk to whatever server/router/firewall situation they are in now.
My favourite of that was when we grew from "some PCs plugged into extension leads from a floorbox" to four full racks of 2U servers, without ever upgrading the power distribution. Until someone noticed that one of the consumer-grade extension leads powering the whole setup had gone brown ...
Oh, and if there was a power cut you'd have to disconnect the racks and bring them back one at a time or the inrush current would trip the building breaker again.
I recall hearing years ago of someone modifying, I believe, the BIOS on a fileserver to spin up the hard drives one at a time so that the inrush current didn't cause brownouts (whether on the PSU or on the building circuit, I don't recall).
> The biggest change was allowing only one drive per tray to be powered on at a time. In fact, to ensure that a software bug doesn’t power all drives on by mistake and blow fuses in a data center, we updated the firmware in the drive controller to enforce this constraint. The machines could power up with no drives receiving any power whatsoever, leaving our software to then control their duty cycle.
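The general idea is simple to sketch: serialize the spin-ups so the peak inrush is one drive's worth instead of N drives' worth. A toy illustration, where power_on() is a hypothetical stand-in for whatever the firmware or BIOS actually pokes (a backplane register, a GPIO, etc.), not any real interface:

    import time

    # Toy sketch of staggered spin-up: enable drives one at a time so the
    # 12V inrush of a spinning-up motor never stacks across drives.
    SPINUP_SECONDS = 10  # assumed worst-case time for a drive to reach speed

    def power_on(drive_id: int) -> None:
        # Hypothetical placeholder for the actual power-enable mechanism.
        print(f"enabling power to drive {drive_id}")

    def staggered_spinup(drive_ids):
        for drive_id in drive_ids:
            power_on(drive_id)
            time.sleep(SPINUP_SECONDS)  # wait out the inrush before the next one

    staggered_spinup(range(8))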
I've seen too many little server rooms where the AC unit is installed above the false ceiling directly above the server rack.
It makes sense, in a way... except that the condensation collection lines are also right above the server rack, and sooner or later someone forgets to do maintenance and you spring a leak.
Put it over the walkway, for pity's sake (also, then you can work on it with a fully populated server room!)
My high school's server rack was in the basement with an AC unit, which pumped its exhaust heat right into an adjoining room that functioned as an equipment room for my extracurricular. We had to get in and out as fast as we could because that room was like 100 degrees.
In the mid-90's, I worked at an early ISP that had most of its network and dialup equipment in a small retail / office plaza. It was in the basement of a unit that they had rented to someone else.
There were about 100 individual phone lines coming off the wall, each going to its own modem. Basically, a river of phone cables. Each modem had its own power supply and serial cable. "Power distribution" consisted of power strips chained 2 or 3 layers deep.
There was no cooling to speak of. I remember going down there in July or August and it felt like the place was going to melt down. Some modems, quite literally, had melted: the plastic was warped and discolored.
I had a client who didn't like to use passwords on his server because he could never remember them when he wanted to check something. His security solution was to put the server in a steel locker with one hole just large enough to pass a cord from a power strip and a CAT5 cable out of the side, then lock the locker so no one had physical access to the server.
He did all this on his own on a Friday and then he left the CRT on all weekend. I got a call on Monday letting me know that "my" server had failed and I needed to get out there immediately. And of course, when I got there he was gone with the only key to the locker.
We were going to move a server one evening, and we told the head of IT ahead of time that we needed to move it at 6. He agreed.
At 5:45 we rolled into the server room area, and he had gone home for the evening. The server room was locked.
Someone, as a Hail Mary, asked building security, and they surprised us by saying of course they had keys. So they came in and tried the doors (the server room had been expanded to fill several offices, so there were 3 doors). No luck with any of them. Which upset the security person, because they were supposed to have keys.
So security wandered off before we started trying to jimmy the locks, and it turned out that the middle door accepted Mastercard. It was never mentioned again and the IT dude never asked (which he should have, really).
I don't know that I'd play it the same today. Back doors get put in, or tolerated, primarily because of fear of (or the reality that) someone isn't doing their job or some group is making a power grab - or, typically, both. It costs political capital to call someone out on that, but the alternative is to have people lying (even by omission) about known security issues with the system or facilities.
Personally, I'd rather have a couple people who have credentials or keys they are sworn 'never to use', except in extraordinary circumstances.
> How do you deal with a tech who leaves a system in a state where it will do nothing but dump hot air at full speed all night long?
Well, this is why organizations no longer keep server fleets in random on-premise used-to-be-copier repair rooms. (Or do they? I wonder if school districts might still be doing that...)
Speaking from my friend's experience, school districts totally still do that. Their uptime only has to be good, not amazing. I don't think he's ever run into any HVAC issues, though.
I didn't care about football, I cared about computers. My entire school district didn't have a hint of CS more advanced than learning how to use Microsoft Office. Additionally, there are ways to play football outside of a school team, I played soccer for many years outside of school and enjoyed myself well enough.
Here in Baltimore many public schools don't have A/C. (Heck, many don't even have reliable heat, but that wouldn't bother a server as much). Hey, at least it (probably... dear lord I hope) forestalls that solution.
Organizations most definitely still do that, including quite large ones that you might think should know better.
Most large retailers have a decent amount of compute equipment at each store. Lots of manufacturing facilities have critical IT equipment sitting right next to the line, and if said equipment goes down, the line grinds to a halt, etc.
Yes, that's a Walmart window A/C sitting on a filing cabinet with a box fan teetering on top. No, it's not vented to the outside. So the temperature in the room continued to rise.
One of my early jobs, I worked for a group of remote sensing scientists that analyzed weather satellite data, and the equipment I was responsible for was in an improvised raised-floor room with probably the smallest unit Liebert produced. I found out one weekend why there were huge fans scattered about the office - the Liebert sometimes failed, for various reasons and at various times, and the 5x 52U racks of Dell PowerEdge and Sun E3800 servers and storage got super hot, super quick.
I had to rush into the office with my boss - a brilliant remote-sensing PhD who flew into hurricanes on P-3 Orions to make sure the data they were gathering matched what they were getting from the microwave scatterometer on the satellite - helping him set up fans and shut down equipment that wasn't absolutely necessary.
After that, I resolved to be alerted proactively and scared up a setup with some Sensatronics temp sensors, Cacti, and Nagios so I could get data over time as well as alerts when over a threshold. I also enabled the OpenManage alerting and found you could get the ambient temp at the air intake of a PowerEdge, so I got that into SNMP and set up alerts on that as well. Fun times.
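Something along these lines still works today (a reconstructed sketch, not my original script; the hostname and community string are placeholders, and the OID is quoted from memory - verify it against your own OpenManage MIB):

    import subprocess
    import sys

    HOST = "pe-server.example.com"  # placeholder hostname
    COMMUNITY = "public"            # placeholder SNMP community string
    # Dell OpenManage temperatureProbeReading, reported in tenths of a
    # degree C. Assumption: check this OID against your MIB before trusting it.
    OID = "1.3.6.1.4.1.674.10892.1.700.20.1.6.1.1"
    THRESHOLD_C = 30.0

    # -Oqv tells net-snmp's snmpget to print just the value.
    out = subprocess.check_output(
        ["snmpget", "-v2c", "-c", COMMUNITY, "-Oqv", HOST, OID], text=True
    )
    temp_c = int(out.strip()) / 10.0
    print(f"ambient intake: {temp_c:.1f} C")
    if temp_c > THRESHOLD_C:
        sys.exit(2)  # Nagios-style CRITICAL exit code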
Side note: when I first walked into the server room, I saw some Apple Xserve RAIDs and thought that was odd next to some Dell stuff, and he said those were the only storage units that could handle the G-forces they experienced in hurricanes. "The PowerVaults just fall right over."
They still had Sun equipment since RHEL was still at v4 and was not fully 64-bit, while Solaris + SPARC had always been 64-bit, and the dataset sizes they were crunching required being able to access RAM-per-process of more than 4GB. MUCH more than.
As soon as you start having your own server/network rooms, yes.
You will have to monitor your HVAC systems, your fire extinguisher systems, your alarm systems (for smoke and water leak detection), your electrical systems... You will need to make decisions about which type of system to install. You will have to set up maintenance and repair of these systems and interact with the specialized technicians who come to service them.
Not saying you have to become an HVAC expert, but you will definitely learn some stuff outside of computing/networking.
At a basic level, most small companies still need a firewall/router and some switches in a closet. Maybe a local backup server? And a UPS to keep it going, and now you need to worry about the temperature.
Do you own the building? Probably not. Does the building landlord keep the HVAC going on nights and weekends? Probably only at a level sufficient to get things back to normal around 8AM the next work-day. So even if you install your own A/C: do you have sufficient power? Where does the exhaust go? Do you have a drain for the water you take out of the air, or a bucket? A nice bucket full of slightly oily water next to your machines? Better make sure there's nothing on the floor.
Add a few temperature sensors and flood-detectors, too. Rig an alert system.
At some level, sysadmins are at the boundary between hardware and software, between the real physical world and the abstract manipulation of symbols and data. With that in mind, you can ignore HVAC to the extent that "facilities" understands your needs and keeps them in mind during routine operations over a year, a day, or a month, from their project managers down to whoever is on site that day perhaps testing generators, switching away from the chilled water loop now that it is winter, and so on.
As a programmer who relies on these services, if the server room "feels hot" but I haven't gotten anything out of monitoring, I will stick my head in. Your monitoring and your A/C can fail at the same time and I have had that happen more than once.
Yes, because the ordinary HVAC service runs on a "best effort" basis where some downtime or reduced capacity is acceptable and you can try and fix stuff if someone complains; but for a datacenter you suddenly need 24/7 99%+ uptime with rapid response in case of failure, because losing AC is only just a bit worse than losing power, and it's easier to have redundant power than truly redundant AC.
Datacenters are competitive HVAC implementations with computers inside. It's all hyper-abstracted in the cloud for most people now, but anyone who dealt with on-prem equipment pre-cloud knew their building's facilities and maintenance team well (as well as basic CFM, BTU, and kVA numbers).
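For those who never had to learn those numbers, they fall out of a couple of rules of thumb. A back-of-envelope sketch (the 20 kW rack, the 20°F delta-T, and the 0.9 power factor are all illustrative assumptions, not numbers from any particular site):

    # Back-of-envelope facilities numbers for a hypothetical 20 kW rack.
    load_kw = 20.0        # assumed example IT load
    delta_t_f = 20.0      # assumed intake-to-exhaust air temperature rise, F
    power_factor = 0.9    # assumed typical-ish PF for server PSUs

    btu_hr = load_kw * 1000 * 3.412        # heat to remove
    cfm = btu_hr / (1.08 * delta_t_f)      # sensible-heat airflow rule of thumb
    kva = load_kw / power_factor           # apparent power to provision

    print(f"{btu_hr:,.0f} BTU/hr, ~{cfm:,.0f} CFM, ~{kva:.1f} kVA")
    # ~68,240 BTU/hr, ~3,159 CFM, ~22.2 kVA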
I happen to have a friend in HVAC. I bought some gauges and basic tools, and he trained me on troubleshooting the AC; it's really helpful.
Having redundancy or at least a backup plan is better. You have to consider all of the risks and what is at stake and how much downtime you can handle.
Many years ago, a brand new server room with its own dedicated HVAC systems, used for nationwide electronic home arrest monitoring, started having all its machines fail overnight every weekend. As you can imagine this was a panic-mode situation, and they had some of the sysadmins stay overnight. It turns out that the HVAC people, despite having been told what the room was used for (and having in fact been in the room setting up the underfloor), had programmed the A/C to shut off over the weekend to save electricity. When drama flows downhill from the Feds no one is happy, and the resulting meeting was almost as uncomfortable as the one after Qwest deliberately cut both sides of a fibre ring and then tried to sell us service.
(We later found out Qwest's field people didn't want to cut a different carrier's fibre so they called back to confirm and were told to cut it or not come back in the morning.)
In the 90s, I worked at a little place that was crammed full of PCs and other electronic equipment. It got uncomfortably hot in the summer. A couple of us went around to all the PCs and changed the settings - instead of a screensaver, turn off the monitor after 10 minutes of inactivity. Problem solved. The age of CRTs...