Over the years, I’ve repeatedly heard that Windows NT is a very advanced operating system and, being a Unix person myself, it has bothered me not to know why. I’ve been meaning to answer this question for years and now that I finally can, I want to present my findings to you.

My desire to know about NT’s internals started in 2006 when I applied to the Google Summer of Code program to develop Boost.Process. I needed such a library for ATF, but I also saw the project as a chance to learn something about the Win32 API. The journey continued in 2020 when I chose to join Microsoft after a long stint at Google, and again in 2021 when I bought the Windows Internals 5th edition book (which I never fully read due to its incredible detail and length). None of these taught me what I really wanted to know, though: the ways in which NT fundamentally differs from Unix, if at all.

Then, at the end of 2023, the Showstopper book sparked this curiosity once again. And soon, a new thought came to mind: the Windows Internals 5th edition book was too dense but… what about the first edition? Surely it must have been easier to digest because the system was much simpler back in the early 1990s. So, lo and behold, I searched for that edition, found it under the title Inside Windows NT, read it cover to cover, and took notes to evaluate NT vs. Unix.

Which brings me to this article—a collection of thoughts comparing the design of NT (July 1993) against contemporary Unix systems such as 4.4BSD (June 1994) or Linux 1.0 (March 1994). Beware that, due to my background, the text is written from the point of view of a Unix “expert” who is mostly clueless about NT, so it focuses on describing the things that NT does differently.

Mission

Unix’s history is long—much longer than NT’s. Unix’s development started in 1969 and its primary goal was to be a convenient platform for programmers. Unix was inspired by Multics but, compared to that other system, it focused on simplicity, a trait that let it triumph over Multics. Portability and multiprocessing were not original goals of the Unix design though: these features were retrofitted in the many “forks” and reinventions of Unix years later.

On Microsoft’s side, the first release of MS-DOS launched in August 1981 and the first release of “legacy Windows” (the DOS-based editions) launched in November 1985. While MS-DOS was a widespread success, it wasn’t until Windows 3.0 in May 1990 that Windows started to really matter. Windows NT was conceived in 1989 and saw the light with the NT 3.1 release in July 1993.

This timeline gave Microsoft an edge: the design of NT started 20 years after Unix’s, and Microsoft already had a large user base thanks to MS-DOS and legacy Windows. The team at Microsoft designing NT had the hindsight of these developments, previous experience developing other operating systems, and access to more modern technology, so they could “shoot for the moon” with the creation of NT.

In particular, NT started with the following design goals as part of its mission, which are in stark contrast to Unix’s:

  1. portability,
  2. support for multiprocessing systems (SMP), and
  3. compatibility with DOS, legacy Windows, OS/2, and POSIX.

These were not goals to scoff at, and they meant that NT started with solid design principles from the get-go. In other words: these features were all present from day one and not bolted on at a later stage like they were in many Unixes.

The kernel

Now that we know some of these design goals and constraints, let’s take a look at the specifics of how they are implemented.

Unix is, with few exceptions like Minix or GNU Hurd, implemented as a monolithic kernel that exposes a collection of system calls to interact with the facilities offered by the operating system. NT, on the other hand, is a hybrid between a monolithic kernel and a microkernel: the privileged component, known as the executive, presents itself as a collection of modular components to user-space subsystems. The user-space subsystems are special processes which “translate” the APIs that the applications consume (be it POSIX, OS/2, etc.) into executive system calls.

One important piece of the NT executive is the Hardware Abstraction Layer (HAL), a module that provides abstract primitives to access the machine’s hardware and that serves as the foundation for the rest of the kernel. This layer is the key that allows NT to run on various architectures, including i386, Alpha, and PowerPC. To put the importance of the HAL in perspective, contemporary Unixes were coupled to a specific architecture: yes, Unix-the-concept was portable because there existed many different variants for different machines, but the implementations were not. SunOS originally only supported the Motorola 68000; 386BSD was the first port of BSD to the Intel architecture; IRIX was the Unix variant for Silicon Graphics’ MIPS-based workstations; and so on. This explains why NetBSD’s main focus on portability via a minimal shim over the hardware was so interesting at the time: other operating systems, except NT, did not have this clean internal design, and NT had come years before.

Another important piece of the NT executive is its support for multiprocessing systems and its preemptive kernel. The kernel has various interrupt levels (SPLs in BSD terminology) to determine what can interrupt what else (e.g. a clock interrupt has higher priority than a disk interrupt) but, more importantly, kernel threads can be preempted by other kernel threads. This is “of course” what every high-performance Unix system does today, but it’s not how many Unixes started: those systems began with a kernel that supported neither preemption nor multiprocessing; then they added support for user-space multiprocessing; and then they added kernel preemption. The latter is the hardest step of all and explains the FreeBSD 5.0 fiasco. So it is interesting to see that NT started with the right foundations from its inception.

Objects

NT is an object-oriented kernel. You might think that Unix is too: after all, processes are defined by a struct and file system implementations deal with vnodes (“virtual nodes”, not to be confused with inodes which are a file system-specific implementation detail). But that’s not quite the same as what NT does: NT forces all of these different objects to have a common representation in the system.

You can rightfully be skeptical about this because… how can you offer a meaningful abstraction over such disparate things as processes and file handles? You can’t, really, but NT forced all of these to inherit from a common object type and, surprisingly, this results in some nice properties:

  • Centralized access control: Objects are exclusively created by the object manager, which means there is a single place in the code to enforce policy. This is powerful because the semantics for, say, permission checks can be defined in just one location and applied uniformly throughout the system. NetBSD concluded this was a good idea too, but it wasn’t until 2006 that it gained its Kernel Authorization (kauth) framework.

  • Common identity: Objects have identities and they are all represented in a single tree. This means that there is a unique namespace for all objects, no matter if we are talking about processes, file handles, or pipes. The objects in the tree are addressable via names (paths) and different portions of the tree can be owned by different subsystems. For example, a portion of the tree can represent a mounted file system, and thus traversing that subtree’s root node will cause the file system to resolve the remainder of the path. This is akin to the VFS layer of a Unix system, with the difference that the VFS is exclusively about file systems whereas the object tree is about every single kernel object. It’s true that Unix has attempted to shoehorn other types of non-file objects into the file system via /proc/, /sys/, and the like—but these feel like afterthoughts compared to what NT offers.

  • Unified event handling: All object types have a signaled state, whose semantics are specific to each object type. For example, a process object enters the signaled state when the process exits, and a file handle object enters the signaled state when an I/O request completes. This makes it trivial to write event-driven code (ahem, async code) in userspace, as a single wait-style system call can wait for a group of objects to change their state—no matter what type they are. Try to wait for I/O and process completion on a Unix system at once; it’s painful.
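As an illustration of that last point, here is a minimal Win32 sketch of my own (not from the book, and with error handling omitted) that uses a single call to wait for either a child process to exit or an unrelated event object to be signaled. On a contemporary Unix, you would have had to juggle a SIGCHLD handler with select(2) to get anything similar.

```c
#include <windows.h>
#include <stdio.h>

int main(void) {
    STARTUPINFOW si;
    PROCESS_INFORMATION pi;
    ZeroMemory(&si, sizeof(si));
    si.cb = sizeof(si);

    /* Spawn a child process; notepad.exe is just a convenient example. */
    wchar_t cmd[] = L"notepad.exe";
    if (!CreateProcessW(NULL, cmd, NULL, NULL, FALSE, 0, NULL, NULL, &si, &pi))
        return 1;

    /* A manual-reset event that some other part of the program could signal. */
    HANDLE stop = CreateEventW(NULL, TRUE, FALSE, NULL);

    /* One call waits on two completely different object types. */
    HANDLE handles[2] = { pi.hProcess, stop };
    DWORD which = WaitForMultipleObjects(2, handles, FALSE, INFINITE);
    if (which == WAIT_OBJECT_0)
        printf("the process exited\n");
    else if (which == WAIT_OBJECT_0 + 1)
        printf("the event was signaled\n");

    CloseHandle(stop);
    CloseHandle(pi.hThread);
    CloseHandle(pi.hProcess);
    return 0;
}
```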

Objects are an NT-specific construct though, and they don’t generalize well to all of the APIs that NT intended to support. An example of this is the POSIX subsystem: POSIX doesn’t have the same concept of objects as NT does, yet NT has to offer some sort of compatibility with POSIX applications. For this reason, while the POSIX subsystem allocates objects from the executive, this subsystem must keep its own bookkeeping to represent the corresponding POSIX entities and performs the logical translation between the two on the fly. The Win32 subsystem, on the other hand, just hands objects to clients without an intermediary.

Processes

Processes are a common entity in both NT and Unix but they aren’t quite the same. In Unix, processes are represented in a tree, which means that each process has a parent and a process can have zero or more children. In NT, however, there is no such relationship: processes can “inherit” resources from their creators—any type of object, basically—but they are standalone entities after they are created.

What wasn’t common back when NT was designed was threads: Mach was the first Unix-like kernel to integrate them, in 1985, which means that other Unixes adopted this concept later on and had to retrofit it into their existing designs. For example, Linux chose to represent threads as processes, each with its own PID, in its 2.0 release in June 1996; and NetBSD didn’t get threads, represented as entities separate from processes, until its 2.0 release in 2004. Contrary to Unix, NT chose to support threads from the very beginning, knowing that they were a necessity for high-performance computing on SMP machines.
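For completeness, this is what spawning a native thread looks like through the Win32 API. It is a minimal sketch of my own, but note how the thread handle that comes back is just another waitable object, consistent with the object model described earlier.

```c
#include <windows.h>
#include <stdio.h>

/* The entry point for the new thread. */
static DWORD WINAPI worker(LPVOID arg) {
    (void)arg;
    printf("hello from thread %lu\n", (unsigned long)GetCurrentThreadId());
    return 0;
}

int main(void) {
    HANDLE t = CreateThread(NULL, 0, worker, NULL, 0, NULL);
    if (t == NULL)
        return 1;
    WaitForSingleObject(t, INFINITE);  /* Signaled when the thread exits. */
    CloseHandle(t);
    return 0;
}
```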

NT doesn’t have signals in the traditional Unix sense. What it does have, however, are alerts, and these come in kernel-mode and user-mode flavors. User-mode alerts must be waited for like any other object, while kernel-mode alerts are invisible to processes. The POSIX subsystem uses kernel-mode alerts to emulate signals. Note that signals have often been called a wart in Unix because of the way they interfere with process execution: handling them correctly is a really difficult endeavor, so NT’s alternative sounds more elegant.
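To make the contrast with Unix signals concrete, here is a small sketch of my own using Win32 user-mode APCs, the closest user-visible relative of NT’s user-mode alerts: the callback only runs when the target thread explicitly enters an alertable wait, instead of interrupting it at an arbitrary point like a signal would.

```c
#include <windows.h>
#include <stdio.h>

/* Runs in the context of the target thread, but only at an alertable wait. */
static void CALLBACK apc_callback(ULONG_PTR param) {
    printf("APC delivered with parameter %lu\n", (unsigned long)param);
}

static DWORD WINAPI worker(LPVOID arg) {
    (void)arg;
    /* TRUE makes the sleep alertable, so queued APCs are delivered here. */
    SleepEx(INFINITE, TRUE);
    return 0;
}

int main(void) {
    HANDLE t = CreateThread(NULL, 0, worker, NULL, 0, NULL);
    Sleep(100);                         /* Give the worker time to reach SleepEx. */
    QueueUserAPC(apc_callback, t, 42);  /* Delivered at the next alertable wait. */
    WaitForSingleObject(t, INFINITE);
    CloseHandle(t);
    return 0;
}
```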

An interesting recent development in NT-land has been the introduction of picoprocesses. Up until this feature was added, processes in NT were quite heavyweight: new processes would get a bunch of the NT runtime libraries mapped into their address space at startup time. In a picoprocess, the process has minimal ties to the Windows architecture, and this is what is used to implement Linux-compatible processes in WSL 1. In a way, picoprocesses are closer to Unix processes than native Windows processes are, but they are not used for much anymore—even though they have only existed since August 2016—because of the move to WSL 2.

Lastly, as much as we like to bash Windows for security problems, NT started with an advanced security design by early-1990s standards, given that the system works, basically, as a capability-based system. The first user process that starts after logon gets an access token from the kernel representing the privileges of the user session, and the process and its subprocesses must supply this token to the kernel to assert their privileges. This is different from Unix, where processes just have identifiers and the kernel needs to keep track of what each process can do in the process table.
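As a small illustration of my own (not from the book), this is how a Win32 process can inspect the token it runs with; the SID printed at the end identifies the logged-on user that the token represents.

```c
#include <windows.h>
#include <sddl.h>
#include <stdio.h>
#include <stdlib.h>

int main(void) {
    HANDLE tok;
    if (!OpenProcessToken(GetCurrentProcess(), TOKEN_QUERY, &tok))
        return 1;

    /* Ask for the required buffer size first, then fetch the data. */
    DWORD needed = 0;
    GetTokenInformation(tok, TokenUser, NULL, 0, &needed);
    TOKEN_USER *tu = (TOKEN_USER *)malloc(needed);
    if (tu != NULL && GetTokenInformation(tok, TokenUser, tu, needed, &needed)) {
        LPSTR sid;
        if (ConvertSidToStringSidA(tu->User.Sid, &sid)) {
            printf("running with a token for user SID %s\n", sid);
            LocalFree(sid);
        }
    }
    free(tu);
    CloseHandle(tok);
    return 0;
}
```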

Compatibility

As mentioned in the introduction, a major goal of NT was to be compatible with applications written for legacy Windows, DOS, OS/2 and POSIX. One reason for this was technical, as this forced the system to have an elegant design; the other reason was political, as NT was a joint development with IBM and NT had to support OS/2 applications even if, in the end, NT ended up being Windows.

This need for compatibility forced NT’s design to be significantly different than Unix’s. In Unix, user-space applications talk to the kernel directly via its system call interface, and this interface is the Unix interface. Oftentimes, but not always, the C library provides the glue to call the kernel and applications never issue system calls themselves—but that’s a minor detail.

Contrast this to NT, where applications do not talk to the executive (the kernel) directly. Instead, each application talks to one specific protected subsystem, and these subsystems are the ones that implement the APIs of the various operating systems that NT wanted to be compatible with. These subsystems are implemented as user-space servers (they are not inside the NT “microkernel”). Support for Windows applications comes from the Win32 server, which is special because it’s the only one that’s directly visible to users: it controls console programs and DOS terminals, and it has certain privileges for performance reasons.

Compared to traditional Unix, NT’s design is very different because the BSDs and Linux have a monolithic kernel. These kernels expose a system call interface that userspace applications leverage to interact directly with the system. The BSDs, however, have long offered support for running binaries built for other operating systems, all within the monolithic kernel: the way this works is by exposing different system call tables to userspace depending on the binary that’s being run, and then translating those “foreign” system calls into whatever the kernel understands. Linux has limited support for this as well via personalities.

Even though the BSD approach is quite different from how NT handles supporting other systems, WSL 1 is extremely similar to it and is not a subsystem in the literal sense in which subsystems were originally defined. In WSL 1, the NT kernel marks Linux processes as picoprocesses and, from there on, exposes a different system call interface to them. Within the NT kernel, those Linux-specific system calls are translated into NT operations and served by the same kernel—just like BSD’s Linux compatibility does. The only problem is that, NT not being Unix, its “emulation” of Linux is tricky and much slower than what the BSDs can offer. It’s a pity that WSL 2 lost the essence of this design and went with a full-on VM design…

To finish this section, two more interesting details: a goal of NT’s design was to allow seamless I/O redirection between subsystems, all from a single shell; and subsystems are exposed to applications via ports which are, of course, NT objects and are similar to how Mach allows processes and servers to communicate.

Virtual memory

NT, just like Unix, relies on a Memory Management Unit (MMU) with paging to offer protection across processes and to provide virtual memory. Paging user-space processes out to disk is a common mechanism to give them a larger address space than the amount of physical memory on the machine. But one thing that put NT ahead of contemporary Unix systems is that the kernel itself can be paged out to disk too. Obviously not the whole kernel—if all of it were pageable, you’d run into the situation where resolving a kernel page fault requires code from a file system driver that was itself paged out—but large portions of it are. This is not particularly interesting these days because kernels are small compared to the typical installed memory on a machine, but it certainly made a big difference in the past, when every byte was precious.

Additionally, while we take the way virtual memory and paging works these days for granted, this was a big area of research back when NT was designed. Older Unix implementations had separate memory caches for the file system and virtual memory, and it wasn’t until 1987 that SunOS implemented a unified virtual memory architecture to reduce the overheads of this old design.

In contrast, NT started with a unified memory architecture from the beginning. You could say that this was easy to do because its designers had the hindsight of the inefficiencies found in Unix and could see the solution that SunOS had implemented before the design of NT started. But regardless, this made NT “more advanced” than many alternate operating systems back then, and it has to be noted that other systems like NetBSD didn’t catch up until 2002 with the implementation of the Unified Buffer Cache (UBC) in NetBSD 1.6.

An interesting difference between NT and Unix is how they manage and represent shared memory. In NT, shared memory sections are (surprise) objects and are thus subject to the exact same access validation checks as any other object. Furthermore, they are addressable in the same manner as any other object because they are part of the single object tree. Contrast this to Unix, where this feature is bolted on: shared memory objects have a different namespace and a different API from every other entity, and thus typical permissions don’t apply to them.
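To make this concrete, here is a minimal Win32 sketch of my own that creates a named shared-memory section. The name, Local\demo_section, is just a placeholder, but it lives in the same object namespace as every other object and the resulting handle is subject to the same access checks.

```c
#include <windows.h>
#include <string.h>

int main(void) {
    /* A 4 KiB section backed by the pagefile, visible to other processes
     * in the same session under the given object name. */
    HANDLE sec = CreateFileMappingW(INVALID_HANDLE_VALUE, NULL, PAGE_READWRITE,
                                    0, 4096, L"Local\\demo_section");
    if (sec == NULL)
        return 1;

    char *view = MapViewOfFile(sec, FILE_MAP_ALL_ACCESS, 0, 0, 4096);
    if (view != NULL) {
        strcpy(view, "hello from one process");
        /* Another process can OpenFileMappingW(..., L"Local\\demo_section")
         * and map the same section to read this string. */
        UnmapViewOfFile(view);
    }
    CloseHandle(sec);
    return 0;
}
```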

I/O subsystem

Early versions of Unix only supported one file system. For example, it wasn’t until 4.3BSD-Reno in 1990 that the BSDs gained the Virtual File System (VFS) abstraction needed to support more than just UFS. NT, on the other hand, started with a design that allowed multiple file systems.

In order to support multiple file systems, the kernel has to expose their namespaces in some way. Unix combines the file systems under a single file hierarchy via mount points: the VFS layer provides the mechanisms to identify which nodes correspond to the root of a file system and redirects requests to those file system drivers when traversing a path. NT has a similar design even if, from the standard user interface, file systems appear as disjoint drives: internally, the executive represents file systems as objects in the object tree, and each object is responsible for parsing the remainder of a path. Those file system objects are remapped as DOS drives so that userspace can access them. And guess what? The DOS drives are also objects under a separate subtree that redirects I/O to the file systems they reference.
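A quick way to see this from user space is to ask where a drive letter points inside the object tree. The following sketch of my own typically prints something like \Device\HarddiskVolume3 for C: on a modern system.

```c
#include <windows.h>
#include <stdio.h>

int main(void) {
    /* Resolve the DOS device "C:" to its target in the object namespace. */
    char target[MAX_PATH];
    if (QueryDosDeviceA("C:", target, sizeof(target)))
        printf("C: -> %s\n", target);
    return 0;
}
```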

In file system terms, NT ended up shipping with NTFS. NTFS was a really advanced file system for its time, even if we like to bash on it for its poor performance (a misguided claim). The I/O subsystem of NT, in combination with NTFS, brought 64-bit addressing, journaling, and even Unicode file names. Linux didn’t get 64-bit file support until the late 1990s and didn’t get journaling until ext3 launched in 2001. Soft updates, an alternate fault-tolerance mechanism, didn’t appear in FreeBSD until 1998. And Unix represents file names as NUL-terminated byte arrays, not Unicode.

Other features that NT included at launch were disk striping and mirroring—what we know today as RAID—and device hot plugging. These features were not a novelty given that SunOS had included RAID support since the early 1990s, but what’s interesting is that they were all accounted for as part of the original design.

At a higher level, the thing that makes the I/O subsystem of NT much more advanced than Unix’s is the fact that its interface is asynchronous in nature and has been like that since the very beginning. To put this in perspective, FreeBSD didn’t see support for aio(7) until FreeBSD 3.0 in 1998, and Linux didn’t see this either until Linux 2.5 in 2002. And even if support for asynchronous I/O has existed in Unix systems for more than 20 years now, it’s still not widespread: few people know of these APIs, the vast majority of applications don’t use them, and their performance is poor. Linux’s io_uring is a relatively recent addition that improves asynchronous I/O, but it has been a significant source of security vulnerabilities and is not in widespread use.
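This is what NT’s native asynchronous I/O looks like through Win32 overlapped I/O. The snippet is a minimal sketch of my own (example.dat is a placeholder and most error handling is omitted): the read is started, the program is free to do other work, and completion is picked up later.

```c
#include <windows.h>
#include <stdio.h>

int main(void) {
    HANDLE f = CreateFileW(L"example.dat", GENERIC_READ, FILE_SHARE_READ,
                           NULL, OPEN_EXISTING, FILE_FLAG_OVERLAPPED, NULL);
    if (f == INVALID_HANDLE_VALUE)
        return 1;

    char buf[4096];
    OVERLAPPED ov = {0};
    ov.hEvent = CreateEventW(NULL, TRUE, FALSE, NULL);

    /* Kick off the read; ERROR_IO_PENDING means it is now in flight. */
    if (!ReadFile(f, buf, sizeof(buf), NULL, &ov) &&
        GetLastError() != ERROR_IO_PENDING)
        return 1;

    /* ... the program is free to do other work here ... */

    DWORD read;
    if (GetOverlappedResult(f, &ov, &read, TRUE))  /* TRUE = wait for completion. */
        printf("read %lu bytes asynchronously\n", (unsigned long)read);

    CloseHandle(ov.hEvent);
    CloseHandle(f);
    return 0;
}
```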

Networking

The Internet is everywhere today, but when NT was designed, that was not the case. Looking back at the Microsoft ecosystem, DOS 3.1 (1985) included the foundations for file sharing in the FAT file system, yet the “OS” itself did not provide any networking features: a separate product called Microsoft Networks (MS-NET) did. Windows 3.0 (1990) included support for NetBIOS, which allowed primitive printer and file sharing on local networks, but support for TCP/IP was nowhere to be seen.

In contrast, Unix was the Internet: all foundational Internet protocols were written for it and with it. During the design of NT, it was therefore critical to account for good network support, and indeed NT launched with networking features. As a result, NT supported both the Internet protocols and the traditional LAN protocols used in pre-existing Microsoft environments, which put it ahead of Unix in corporate environments.

As an example, take NT’s network domains. In Unix, network administrators typically synchronized user accounts across machines by hand; they might use the X.500 directory protocol (1988) and Kerberos (1980s) for user authentication, which systems like SunOS implemented, but these technologies weren’t particularly simple. Instead, NT offered domains from the get-go, which integrated directory and authentication features, and it seems to me that these “won the day” in corporate networks because they were much easier to set up and were built into the system.

The goal of synchronized user accounts is to share resources across machines, primarily files, and when doing so, the way to represent permissions matters. For the longest time, Unix only offered the simplistic read/write/execute permission sets for each file. NT, on the other hand, came with advanced ACLs from the get-go—something that’s still a sore spot on Unix. Even though Linux and the BSDs now have ACLs too, their interfaces are inconsistent across systems and they feel like an alien add-on to the design of the system. On NT, ACLs work at the object level, which means they apply consistently throughout all kernel features.
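As a small example of this uniform security API, here is a sketch of my own (the path is a placeholder) that reads the discretionary ACL attached to a file; the same call handles other securable object types by changing the SE_OBJECT_TYPE argument.

```c
#include <windows.h>
#include <aclapi.h>
#include <stdio.h>

int main(void) {
    PACL dacl = NULL;
    PSECURITY_DESCRIPTOR sd = NULL;
    /* Fetch only the DACL portion of the file's security descriptor. */
    DWORD rc = GetNamedSecurityInfoA("C:\\some\\file.txt", SE_FILE_OBJECT,
                                     DACL_SECURITY_INFORMATION,
                                     NULL, NULL, &dacl, NULL, &sd);
    if (rc == ERROR_SUCCESS && dacl != NULL) {
        printf("the file has %u access control entries\n",
               (unsigned)dacl->AceCount);
        LocalFree(sd);
    }
    return 0;
}
```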

And speaking of sharing files, we must talk about networked file systems. In Unix, the de facto networked file system was NFS, whereas on NT it was SMB. SMB was inherited from MS-NET and LAN Manager and is implemented in the kernel via a component called the redirector. In essence, the redirector is “just” one more file system, like NFS is on Unix, that traps file operations and sends them over the network, which brings us to comparing RPC systems.

Even though protobuf and gRPC may seem like novel ideas due to their widespread use, they are based on old ones. On Unix, we had Sun RPC from the early 1980s, built primarily to support NFS. Similarly, NT shipped with built-in RPC support: its own DSL, known as MIDL, to specify interface definitions and to generate code for remote procedures, plus its own facility to implement RPC clients and servers.

Moving down the stack, Unix systems have never been big on supporting arbitrary drivers: remember that Unix systems were typically coupled to specific machines and vendors. NT, on the other hand, intended to be an OS for “any” machine and was sold by a software company, so supporting drivers written by others was critical. As a result, NT came with the Network Driver Interface Specification (NDIS), an abstraction to support network card drivers with ease. To this day, manufacturer-supplied drivers are just not a thing on Linux, which leads to interesting contraptions like ndiswrapper, a very popular shim in the early 2000s that made it possible to reuse Windows drivers for WiFi cards on Linux.

Finally, another difference between NT and Unix lies in their implementation of named pipes. Named pipes are a local construct in Unix: they offer a mechanism for two processes on the same machine to talk to each other with a persistent file name on disk. NT has this same functionality, but its named pipes can operate over the network. By placing a named pipe on a shared file system, two applications on different computers can communicate with each other without having to worry about the networking details.
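Here is a sketch of my own for the client side; someserver and demo_pipe are placeholders. Note how opening a remote pipe is just a CreateFileW call with a UNC path, with no socket code involved.

```c
#include <windows.h>
#include <stdio.h>

int main(void) {
    /* L"\\\\.\\pipe\\demo_pipe" would name the same pipe on the local machine. */
    HANDLE pipe = CreateFileW(L"\\\\someserver\\pipe\\demo_pipe",
                              GENERIC_READ | GENERIC_WRITE, 0, NULL,
                              OPEN_EXISTING, 0, NULL);
    if (pipe == INVALID_HANDLE_VALUE)
        return 1;

    char buf[128];
    DWORD read;
    if (ReadFile(pipe, buf, sizeof(buf) - 1, &read, NULL)) {
        buf[read] = '\0';
        printf("the server said: %s\n", buf);
    }
    CloseHandle(pipe);
    return 0;
}
```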

User-space

We are getting close to the end, I promise. There are just a few user-space topics to briefly touch on:

  • Configuration: NT centralized system and application configuration under a database known as the registry, freeing itself from the old CONFIG.SYS, AUTOEXEC.BAT and the myriad INI files that legacy Windows used. This made some people very angry, but in the end, a unified configuration interface is beneficial to everyone: applications are easier to write because there is a single foundation to support, and users have an easier time tuning their system because there is just one place to look at (see the registry sketch after this list).

    Unix, on the other hand, is still plagued by dozens of DSLs and inconsistent file locations. Each program that supports a configuration file has its own made-up syntax, and knowing which locations the program reads is difficult and not always well-documented. The Linux ecosystem has pushed for a more NT-like approach via XDG and dconf (previously GConf) but… it’s an uphill battle: while desktop components use these technologies exclusively, the foundational components of the system may never adopt them, leaving an inconsistent mess behind.

  • Internationalization: Microsoft, being the large company that was already shipping Windows 3.x across the world, understood that localization was important and made NT support it from the very beginning. Contrast this to Unix, where UTF-8 support didn’t start to show up until the late 1990s and supporting different languages came via the optional gettext add-on.

  • The C language: One thing Unix systems like FreeBSD and NetBSD have fantasized about for a while is coming up with their own dialect of C to implement the kernel in a safer manner. This has never gone anywhere except, maybe, for Linux relying on GCC-only extensions. Microsoft, on the other hand, had the privilege of owning a C compiler, so they did do this with NT, which is written in Microsoft C. As an example, NT relies on Structured Exception Handling (SEH), a feature that adds try/except clauses to handle software and hardware exceptions. I wouldn’t say this is a big plus, but it’s indeed a difference.
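And here is the registry sketch promised in the configuration bullet above, again an illustration of my own: reading a single value takes one call, and the same API serves any application or system setting. The key and value names below are real Windows locations, but the snippet is just an example.

```c
#include <windows.h>
#include <stdio.h>

int main(void) {
    char value[256];
    DWORD size = sizeof(value);
    /* Read the product name string from the well-known CurrentVersion key. */
    LSTATUS rc = RegGetValueA(HKEY_LOCAL_MACHINE,
                              "SOFTWARE\\Microsoft\\Windows NT\\CurrentVersion",
                              "ProductName", RRF_RT_REG_SZ, NULL, value, &size);
    if (rc == ERROR_SUCCESS)
        printf("running on: %s\n", value);
    return 0;
}
```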

Conclusion

NT was groundbreaking technology when it launched. As I presented above, many of the features we take for granted today in systems design were present in NT since its inception, whereas most Unix systems had to gain those features slowly over time. As a result, such features don’t always integrate seamlessly with the Unix philosophy.

Today, however, it’s not clear to me that NT is truly “more advanced” than, say, Linux or FreeBSD. It is true that NT had more solid design principles at the onset and more features than its contemporary operating systems, but nowadays… the differences are blurry. Yes, NT is advanced, but not significantly more so than modern Unixes.

What I find disappointing is that, even though NT has all these solid design principles in place… bloat in the UI doesn’t let the design shine through. The sluggishness of the OS even on super-powerful machines is painful to witness and might even lead to the demise of this OS.

I’ll leave you with the books used to write this article in case you want to go through my learning journey. I had to skip over tons of interesting details, as you can imagine, so these are worth a read:

And if you want to continue my journey and truly dive deep into how each piece of modern NT and Unix works, the newer editions are a must read: