Blog 762: Machine Reborn

There comes a time in everyone’s life when their computer dies unexpectedly. To be honest, I should be glad I’ve lasted twenty-odd years of computing without it ever happening. Every machine I’ve discarded has been because I wanted more power, not because they were from life untimely ripped.

Alas, it happened a few weeks ago to dear old Helios (named for the AI in Deus Ex), about 4 and a half years into his projected 5 to 6 year lifespan. This is the story of his death… and his resurrection.

Death

It was a confusing death.

There I was, happily going about my business, when suddenly… “There is no ethernet cable plugged in.” Uh? When such things happen I tend to blame my BT Home Hub because it’s a bit stupid, so I dutifully went and plugged Airlane the tiny laptoblet (named for the Gary Numan instrumental) in directly to confirm… and the internet was fine. How bizarre!

By the time I returned to Helios, barely a minute later, he was dead. Lights on, fans spinning, but unresponsive.

Hardware has never been in my comfort zone. I’ve wrangled every version of Windows since 98 (yes, even ME and Vista) so I’m well aware of how to contort it into working exactly the way I want it in any situation (well, except 10, where all the good options have been removed). I’ve always been scared of the underlying tech, though. This is the exact reason why I bought Helios as a custom build from Scan instead of constructing him myself.

I mean, I’ve done my fair share of adding extra RAM and swapping out hard drives, and even a graphics card or two, but these things are easily accessible. They’re on the surface, they’re a few clickity switches and maybe a screw or two from coming in or out. They’re designed to be easily replaceable.

But a first pass investigation said it was nothing so simple. Oh no. This problem had to be right in his poor, rotten core.

Post-Mortem

Initial investigations provided ambiguous results.

Pressing the power button caused an intermittent response. It would sometimes turn on for a second, then off again. It would sometimes power up, get a good way through booting (at least, according to the little debug digits), then die. Crucially, there were no “this is bad” lights lit up on the board itself, so it appeared that all the components inside were in good shape. So why wasn’t it starting?

The internet provided three possible causes for this behaviour. One: the processor is dead. Two: the motherboard is dead. Three: the power supply is dead. Unfortunately, the internet didn’t provide many ways to work out which one was dead, short of swapping them in and out (hint: I don’t have spare motherboards and processors lying around).

In a stunning display of wishful thinking, I veered towards it being the power supply. Although this is the beating heart of the system, it’s also the easiest of those three parts to replace — it’s on the edge, with a network of wires connecting it to everything else. To test this theory, I unplugged all the wires and performed… “the paperclip test”. This test scared me because it involves, quite simply, sticking a paperclip between two holes in one of the plugs and turning it on, “jump-starting” it. If the supply is good, it’ll activate; if it’s bad, it won’t.

It didn’t activate. Luckily, at just under five years old, the power supply was still within the manufacturer’s warranty, so I sent off for a replacement.

Minor Surgery

The replacement was of course slightly different than the old one, being four and a half years newer. Not by much — it’s just that the “24-pin ATX” socket actually has 26 pins now. Thanks lads! So while every other wire went back in just fine, the main motherboard power cord had to be replaced.

This is where I had my first battle with cable management. All the internal wires were tied together and strapped to the case with unbreakable sticky clips — beautifully organised, but only for the precise existing setup. To get that cable out so I could put the new one in, I had to cut my way through.

Fine, whatever, I managed it in the end. I hit the power and…

Nope. Still dead.

The bottom fell out of my stomach. It’s not the power supply. That means it’s the motherboard and/or the processor.

The expensive, difficult, scary bits. Oh dear.

(To be fair, it’s not impossible that all three components had failed. Perhaps the power supply died, taking the motherboard out, which then killed the processor. I guess we’ll never know for sure. Well, unless you want to take this old mobo+CPU combo off my hands?)

The Way Forward

This is usually the point where I fire up the catalogue and start speccing out a whole new machine. The spanner in the works this time is that… well, I wasn’t ready for this. The transition between machines is usually managed — all the ephemeral bits and pieces are saved to an external drive, all the programs and saved games are extracted and huddled together for transferal.

On this occasion, though, Helios had died mid-swing. I was literally in the middle of something. Although I could afford to retrace my steps, having a recent back-up of Exon’s codebase from a few days earlier, I didn’t want to do that unless I really had to.

My first instinct was to get a whole new machine and only transplant the hard drives. But then I thought — the graphics card is still good too. And the sound card. And the CD drive and the memory card reader and the cooler and the fans. This system is, by and large, still good. My graphics card, in fact, is actually still a pretty reasonable spec, and I have so far had no trouble playing modern games here.

So I thought, no, it would be a horrible waste to bin all of this because one part was broken. That’s what I did when I was young and feckless. I thought, no, I’m an adult now — I can do this. I can buy a new processor and a new motherboard and I can rebuild him.

We Have The Technology

Some of you may be aware of my insistence on sticking to Windows 7. This put a curious damper on my plan to buy a new processor, because anything “7th gen” and newer would be rejected by Windows 7, with Microsoft ramping up their discontinuing of support for the old operating system. So even if I did get a new processor and motherboard, I would then be plunged into a world of wrangling my Windows installation into working with hardware every fibre of the Microsoft Cloud was telling it to resist.

Things were not looking good.

Then I thought… Can I still buy a 6th gen processor and a motherboard to match? Is there something out there that’s old enough that I might just be able to squeeze through the gap? After all, I am in no particular mood for an upgrade to the system — I only need it to keep on working as before.

There were not many options left on the market, but yes, I found one. I set my sights on an i7 6700K, a bit of a side-grade but otherwise a pretty decent match. (My original processor had 6 cores at 3.3GHz; this one has 4 cores at 4GHz. Considering my primary workload of old games that are not threaded, this might actually be an overall upgrade. The original was also a 5th gen processor, so this one is not quite as old.)
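For the curious, here is a back-of-envelope sketch of that side-grade. It uses only the core counts and clock speeds quoted above, and it assumes (very crudely) that performance scales with clock speed alone, ignoring any per-generation improvements. The function names are my own invention for illustration.

```python
# Crude model: a workload using N threads gets min(cores, N) * clock speed.
# This ignores turbo clocks, cache and architecture changes entirely.

OLD_CPU = {"cores": 6, "clock_ghz": 3.3}  # the original processor
NEW_CPU = {"cores": 4, "clock_ghz": 4.0}  # the i7 6700K


def aggregate_ghz(cpu, threads_used):
    """Total clock speed available to a workload using `threads_used` threads."""
    return min(cpu["cores"], threads_used) * cpu["clock_ghz"]


def new_vs_old(threads_used):
    """Ratio of new to old; above 1.0 means the new chip comes out ahead."""
    return aggregate_ghz(NEW_CPU, threads_used) / aggregate_ghz(OLD_CPU, threads_used)


for threads in (1, 4, 6):
    print(f"{threads} thread(s): new/old = {new_vs_old(threads):.2f}")
```

Under those assumptions the new chip wins whenever a program uses four threads or fewer, and only loses out once something can keep all six of the old cores busy.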

Then I found a motherboard that would take a 6th or 7th gen processor, and there were actually several options… So I picked a middle tier board, the Asus PRIME Z270-A; not the cheapest, as I don’t want it to die again tomorrow, but not the most expensive either, because this is effectively obsolete technology. (I had an Asus X99-A before, my hope being that although the chipset is vastly different, keeping to the same manufacturer should mean most of the holes would be in similar places. Oh, my sweet, naive little boy.)

Major Surgery

The great thing about the motherboard is that it’s underneath everything else. While many of the wires are nice plugs with nice sockets, others are tiny metal pins that just protrude into thin air. Motherboards are terrifying.

Removing everything to get at it was a delight. The cable labelled “HD Audio” that connects the sound card to the front panel was incredibly stiff, to the point where I wondered if it was even meant to come out. The graphics card slot just so happened to be behind the remains of a pop rivet, making it require some pant-wetting contortions to extricate.

But, a huge number of plugs and a case fan later, I had it out. Helios’ guts were arrayed around the studio on every available surface.

This part of the process was irritating but not all that scary. After all, the system was dead and I had nothing to lose by damaging the motherboard, and the rest of the components were fairly secure.

The next step, installing the new processor onto the new motherboard, is where things started to get fun.

I noticed early on that the old motherboard had a large bracket around the processor, where the cooling unit was attached. I noticed that the new motherboard had nothing but four holes in roughly similar positions. All right, fine, we researched this — probably need a backplate. Oh good, there’s one right in the bucket of bolts that Scan left me with. Right it’s… oh no, the holes are in the wrong… Nope, the screws can be slid about a bit, it’s still good. Phew.

Next, the processor had to go in. The processor is terrifying because it’s tiny and weighs hardly anything, and yet it’s the culmination of thousands of man-years of research and manufacturing and, oh yes, it cost me £250. What could possibly go wrong? Just hold it by the sides, hold it by the sides, hold it…

Pulling the latch down afterwards was somehow even worse, because I had no idea how much pressure meant I’d seated it wrong and was about to smash it to pieces versus how much pressure meant it was just being pressed down firmly. I gritted my teeth and gently… gently… In, locked, no crunching noises.

Now just to get the cooler on top… the screws are too short? What the… Nope, it’s fine, there’s a slightly different set of screws to use with the backplate.

Sweat sweat sweat sweat sweat. If my pants were merely yellow before that moment, they sure as hell went brown afterwards.

After that, though, it was plain sailing — find the new homes for all those dangling plugs. Of course I took photographs of what came before, but the new motherboard was subtly different in so many ways. The fan plugs hadn’t just moved, they’d been scattered across the whole board. The front USB panel socket had moved away from the power button pins. I had to battle that tight cable management again, slicing through ties to get the extra slack required for the new sockets. (He’s no longer remotely as pretty inside, but as long as he works, who’s looking in there anyway?)

In the end, though, I managed it. Everything was in place.

I flipped the switch.

Rebirthing Pains

It came on. It did nothing, except that the MemOK! light went on. I had read the manual, so I knew this was probably fine. I didn’t quite remember what to do though, so I turned it off and on again.

IT BOOTED FIRST TIME.

Not only did it blast through the BIOS, it went straight to my old Windows install. Admittedly, that Windows install was on the black “didn’t shut down right” screen, but it was there. I was expecting to have to use Startup Repair to give it a few kicks, but nope, it… Just Worked. Sweet rapture, my gambit of buying older technology had paid off.

It wasn’t quite over yet, though. While Windows was up on the primary hard drive, a solo SSD, I have two spinning disk drives in a RAID0 setup for my secondary data drive (giant SSDs were prohibitively expensive back then, but I still wanted a chunk of speed). While the fresh motherboard was happy booting into Windows, and the BIOS could see two hard drives plugged in, nobody saw them as the single entity they were supposed to be.
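For anyone wondering why two perfectly visible drives are useless on their own: RAID0 stripes data across both disks, so each one holds only every other chunk. Here is a toy sketch of the idea. The chunk size and function names are made up for illustration, and it bears no relation to how any real controller or driver lays out its data.

```python
STRIPE_SIZE = 4  # bytes per chunk; hypothetical, real arrays use far larger chunks


def stripe(data, disks=2):
    """Deal fixed-size chunks of `data` out to each disk in turn."""
    out = [bytearray() for _ in range(disks)]
    for i in range(0, len(data), STRIPE_SIZE):
        out[(i // STRIPE_SIZE) % disks] += data[i:i + STRIPE_SIZE]
    return [bytes(d) for d in out]


def unstripe(parts):
    """Rebuild the original data by reading one chunk from each disk in turn."""
    result = bytearray()
    offsets = [0] * len(parts)
    disk = 0
    while any(offsets[d] < len(parts[d]) for d in range(len(parts))):
        result += parts[disk][offsets[disk]:offsets[disk] + STRIPE_SIZE]
        offsets[disk] += STRIPE_SIZE
        disk = (disk + 1) % len(parts)
    return bytes(result)


disk_a, disk_b = stripe(b"ABCDEFGHIJKLMNOP")
print(disk_a, disk_b)                # each disk alone is an incomplete jumble
print(unstripe([disk_a, disk_b]))    # together they make sense again
```

Each disk on its own holds interleaved fragments, which is why something (the BIOS, a controller or a driver) has to know the two belong together before the data is readable.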

Now I was under the impression that RAID is a motherboard-level thing. My previous machine, Daedalus (named for another AI in Deus Ex), died early on in his career because a power surge knocked the motherboard RAID settings — I had to turn it back on in the BIOS. So I went to the new BIOS and… turned on RAID mode.

Except then the main SSD disappeared. It was not bootable in RAID mode. The BIOS now knew the two secondary drives were merged into one, but it couldn’t handle the solo drive where Windows lived. (I also discovered that I hadn’t plugged the DVD drive back in correctly at this point. Whoopsie!)

This is the point where genuine despair crept back in. After all, the point of this entire enterprise was to recover the system and all its data intact. I had no desire to buy another hard drive, freshly install Windows, and then recover everything manually. I tried Windows Startup Repair from the boot CD but even it had no idea there was anything to repair.

However, I was in no mood to give up, so I put it back into non-RAID mode and booted into Windows again. Maybe I could take the two secondary drives out, put them in an external enclosure and recover them the hard way? They contain less critical data than the main SSD anyway.

Then it happened.

I installed the chipset drivers, the network card drivers, some storage controller drivers and suddenly… it was there. Windows knew it was one drive. It would appear that RAID can be handled completely at the operating system level. Maybe the old setup even did this too; I had never actually checked its settings, because it arrived all ready to go.

Regardless, I had reached the light at the end of the tunnel. Six gruelling hours of fear, uncertainty and delicate ministrations later, Helios was reborn — and I had successfully performed the most frightening bit of computer maintenance possible.

I marked this new lease of life, both for me and for the computer, with the time-honoured ceremonial First Play — Unreal Tournament.
