What would happen if your computer blew up NOW?

By   2015-08-22

What should have been a really easy day became an exhausting, day-long marathon with no guarantee of success. A user reported various problems with her computer account. After some quick homework I learned that her account name had been created with a typo. The typo got corrected, but lots of things were left screwed up.

I copied the data out of the user profile folder and deleted the account on the computer. I then let the user log in so I could copy the data back afterwards. And here’s where it started: the profile was marked as temporary, which meant it would be deleted at log-off! Ouch!

A search engine quickly pointed to the registry, and a call to a colleague confirmed this. Deleting the references to the old user name didn’t help, and neither did renaming a .bak profile back to normal. The only way to solve the problem within an acceptable time frame was to give the user a new laptop, so the right profile would get generated on the first go, and to copy the data to a shared drive for the user to access later.

Escher, but without the awesome factor

What did this story teach me?

It reinforced my opinion that the Windows registry is an insane asylum run by the patients. Nothing is obvious there, and one simple thing (like a log-in name) gets saved in a bazillion places, which makes correcting stuff harder. Moreover, the registry is a SPOF (single point of failure). Mess up the wrong key and the whole system can go down. It’s so internally complex and dense that it’s sometimes (or maybe more often than sometimes) easier to just replace the machine and reinstall the system than to waste your time and the user’s on a quest without an end.

And now for a slightly more scientific breakdown. When a user accesses an account on a Windows machine at home, she or he logs into that computer, into what we call a local user account. That account lives only on that computer, and if the machine blows up, the account goes with it (I know, you have backups, and Windows 8+ does a cloud sync for you with the online type of account).

In organizations bigger than just a handful of people (like a big international company) it would be insane to have only local accounts. If every computer is an island and its inhabitants do whatever they please, it’s impossible to enforce any kind of access control, security, compliance and all that other Black Mesa stuff that blows up in your face one day.

Have a safe and productive day!

What happens when you log into a company computer is this: you don’t log into the computer, you log into a server called a domain controller, and the user you have is a domain user. That’s a far cry from a local account, since the account is stored in a service called Active Directory (AD), and if the computer blows up, the most important settings (from the sysadmin point of view, not the user point of view) get re-downloaded and applied again on the new machine. You’ll still lose your files, but at least the admin can set a coherent wallpaper on 1000+ computers (yay!).

(OK, actually there is a full solution called a roaming profile, but this will absolutely smash your network with its bandwidth requirements).

In the end a portion of that data ends up, one way or another, in the registry. The word “directory” in AD is important. AD is an example of a directory service: it holds data to be used by people over the network. What’s even more important: Microsoft built and owns AD, but AD itself is an implementation of a far wider concept called LDAP.

What does this have to do with the problem my user faced?

Change my name I remain the same…

Every user in an LDAP database is an object. All “things” in LDAP are called objects (printers, computers, and folders [called organizational units] included). Every object has a UUID (Universally Unique ID) that allows for (near) perfect identification, and this ID stays absolutely constant for the object’s whole lifetime. If you’re an admin working with AD and you’ve seen “funny digits” in folder permissions instead of what should be a group or user name: that’s exactly this kind of ID exposed (strictly speaking those digits are a SID, the security identifier Windows uses for permissions, but the idea is the same).

The UUID is what we could call the real “name”. Take a chair whose UUID is 00-000-00 (UUIDs are much longer, but I’m not going to poke my readers in the eyes with these digits). The chair can get new paint or a better cushion, but its UUID never changes. We can call this chair “Chair Foo” and rename it “Chair Bar”. Change the name, and the UUID remains the same.

That’s why, when someone gets married and takes a new name, all access rights within Windows Server/AD (or a storage solution/LDAP solution) keep working as they used to: the system recognizes the object by UUID, not by “human friendly” names like John or Mr. Brainsample.
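A minimal sketch of that idea in Python, not of how AD really stores things: `DirectoryObject`, `can_access` and `shared_folder_acl` are names I made up for illustration. The only point is that permissions hang on the ID, so a rename can’t break them.

```python
import uuid
from dataclasses import dataclass, field


@dataclass
class DirectoryObject:
    """A toy directory entry: the display name can change, the ID cannot."""
    name: str
    # Assigned once at creation and never touched again.
    object_id: uuid.UUID = field(default_factory=uuid.uuid4)


def can_access(acl: set, obj: DirectoryObject) -> bool:
    """Permissions are checked against the ID, never against the name."""
    return obj.object_id in acl


user = DirectoryObject("Miss Foo")
id_before = user.object_id

# Grant access to a shared folder by ID (hypothetical ACL, just a set of IDs).
shared_folder_acl = {user.object_id}

# The rename (marriage, typo fix, whatever) only touches the label.
user.name = "Miss Bar"

print(user.object_id == id_before)          # True
print(can_access(shared_folder_acl, user))  # True
```

Had the ACL stored the string “Miss Foo” instead, the rename would have locked the user out on the spot.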

What happened was this: when Miss Foo logged into her computer for the first time, Windows took some of her details found in AD and built a profile. One of the most important parts of the profile is the profile’s name, which is also the name of the folder where all your stuff usually resides if you’re using only the C:\ drive. You can check it yourself: on Windows 7, go to C:\Users\ and you’ll see your profile name, like Lancelot.

Miss Foo logged in and Windows created the folder MissFoo on her PC, so by the time the error was spotted, it was too late. Her name should have been Miss Bar and her account name MissBar. The typo got corrected: the object 00-000-01 was renamed from Miss Foo to Miss Bar. But lots of stuff stayed on the endpoint machine as it was, because the UUID had been saved somewhere in the registry, probably all over the place.
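Here is the same mess as a toy model in Python. Everything in it (`directory`, `local_profiles`, `create_user`, `log_in`) is hypothetical and hugely simplified; real Windows profile handling is far messier. It only shows why a rename on the server never reaches a folder the endpoint already built.

```python
import uuid

# Hypothetical directory on the server: object ID -> current account name.
directory = {}
# Hypothetical cache on the endpoint: object ID -> profile folder built at first login.
local_profiles = {}


def create_user(account_name):
    """Create a directory object and return its brand-new ID."""
    object_id = uuid.uuid4()
    directory[object_id] = account_name
    return object_id


def log_in(object_id):
    """Return the profile folder for this ID, creating it on first login only."""
    if object_id not in local_profiles:
        local_profiles[object_id] = "C:\\Users\\" + directory[object_id]
    return local_profiles[object_id]


miss = create_user("MissFoo")   # account created with the typo
first = log_in(miss)            # the machine builds the MissFoo folder

directory[miss] = "MissBar"     # the typo is fixed, but only on the server
second = log_in(miss)           # same ID, so the old folder is simply reused

print(first)            # C:\Users\MissFoo
print(second)           # C:\Users\MissFoo
print(directory[miss])  # MissBar
```

The server says MissBar, the laptop keeps saying MissFoo, and both are “right” from their own point of view.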

In the end you might ask: TD, if you’re such a smart-ass, why didn’t you go into the registry and fix it? You could have extracted the UUID and searched for it! My answer:

  • I’m not an expert on AD/regedit, and (more importantly, even if I were):
  • When a user is left without work tools because of what should be an easily correctable error, I’m *not* going to spend hours masturbating my own IT ego, and;
  • I’m not wasting my time on fixing said what-should’ve-been-an-easily-correctable error, because:
    • It’s tons faster to replace the hardware;
    • I can invest that time into yelling out rants like this one 😉

The whole hardware swap was a classic workaround: I didn’t solve the core problem. The solution would be this:

DELETE the goddamn object 00-000-01,

Create a new user, who would get UUID 00-000-02,

Ask the user to log in, shoot more stuff in Unreal Tournament, call it a day.

Because once a UUID is created, it propagates everywhere. And if something is broken, why waste your time on fixing it? I’m looking at you, Windows.
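For the curious, the delete-and-recreate idea as a toy Python sketch. Again, `directory`, `local_profiles`, `create_user` and `log_in` are made-up names and this is a simplification under my own assumptions, not a description of Windows internals. The new object gets a new ID, so the endpoint has nothing stale to trip over.

```python
import uuid

directory = {}       # hypothetical directory: object ID -> account name
local_profiles = {}  # hypothetical endpoint cache: object ID -> profile folder


def create_user(account_name):
    """Create a directory object and return its brand-new ID."""
    object_id = uuid.uuid4()
    directory[object_id] = account_name
    return object_id


def log_in(object_id):
    """Build a profile only if the machine has never seen this ID."""
    if object_id not in local_profiles:
        local_profiles[object_id] = "C:\\Users\\" + directory[object_id]
    return local_profiles[object_id]


old_id = create_user("MissFoo")
log_in(old_id)                   # the broken profile now exists for the old ID

del directory[old_id]            # step 1: delete the broken object
new_id = create_user("MissBar")  # step 2: create a fresh one with a fresh ID
new_profile = log_in(new_id)     # step 3: first login builds a clean profile

print(new_id != old_id)  # True
print(new_profile)       # C:\Users\MissBar
```

The old MissFoo leftovers are still lying around on the machine in this model, but nothing points at them any more.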