OS & CPU Architecture

Oct.06.2017 | 8m Read | ^DevOps

Operating Systems and the CPU work together in software development for multi-processing, distributed deployment, embedded systems, memory management (ie, kernel space), and more. Here we'll start from system power-off and cover everything up to GPU-based Machine Learning. Let's dive in!

Pre-Boot:

▼ BIOS (Basic Input/Output System):

    • ☑ Firmware instructions that initialize and test the hardware, guided by the settings stored in CMOS.
      ☑ Stored on flash memory.
        ◆ The CMOS battery typically lasts around 10 years; once it dies, the BIOS settings reset to defaults on every restart.
        ◆ Flash memory wears out due to data overwrites.
  • ▼ CMOS (Complementary Metal Oxide Semiconductor):

    • ☑ A chip on the motherboard (or mainboard, the board other components connect to) that stores the BIOS settings.
      ☑ Holds a small amount of battery-backed memory (traditionally 64 bytes).
      ☑ Many boards have a backup CMOS for alternate settings and in case of data corruption (see the port-I/O sketch after the POST item for reading CMOS from user space).
        ◆ Set via jumper pin(s) [removable plastic covers placed over board pins].
  • ▼ Memory Registers (Random Access Memory [RAM]):

    • ☑ Are now initialized and power management is started.

    ▼ POST (Power-On Self-Test):

    • ☑⁡⁡ Tests hardware components to determine the system's boot fitness.
      ☑ Feedback may be relayed via an audible beep code (ie, a single short beep for pass).
        ◆ Or LED light blinks/colors.
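
To make CMOS less abstract, here's a minimal sketch of reading one CMOS register (the RTC seconds counter) from user space. It assumes x86 Linux with glibc's <sys/io.h> and root privileges for ioperm; CMOS is reached through I/O ports 0x70 (register select) and 0x71 (data).

```c
/* Hedged sketch: reading the RTC seconds register out of CMOS on x86 Linux.
 * Assumes glibc's <sys/io.h>, an x86 machine, and root (ioperm needs raw I/O rights).
 * CMOS is addressed through I/O ports 0x70 (register select) and 0x71 (data). */
#include <stdio.h>
#include <stdlib.h>
#include <sys/io.h>

int main(void) {
    /* Request access to ports 0x70-0x71. */
    if (ioperm(0x70, 2, 1) != 0) {
        perror("ioperm (run as root on x86)");
        return EXIT_FAILURE;
    }
    outb(0x00, 0x70);              /* select CMOS register 0x00: RTC seconds    */
    unsigned char sec = inb(0x71); /* read it back (BCD-encoded on most boards) */
    printf("RTC seconds register (BCD): 0x%02x\n", sec);
    ioperm(0x70, 2, 0);            /* drop port access again */
    return EXIT_SUCCESS;
}
```
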
  • Boot[strap load]:

    ▼ Boot[strap] Loader [program]:

    • ☑⁡⁡ On successful POST, the system 'bootstraps' or pulls itself together.
        ◆ The ROM (Read-Only Memory) points to the device and memory location where the OS loading instructions are accessed (ie, the boot sector on a hard disk); see the signature-check sketch after this item.
        ◆ The core of the OS (the kernel) is loaded and starts itself.
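
As a small illustration of what the firmware looks for in a boot sector, this sketch reads the first 512 bytes of a disk (or disk image) and checks for the classic 0x55 0xAA boot signature. The /dev/sda path is just an example; point it at any device or image you can read.

```c
/* Hedged sketch: checking the classic 0x55 0xAA boot-sector signature that legacy
 * firmware looks for in the first 512-byte sector of a boot device. */
#include <stdio.h>

int main(void) {
    const char *path = "/dev/sda";     /* assumption: adjust to your disk or image */
    unsigned char sector[512];
    FILE *f = fopen(path, "rb");
    if (!f) { perror(path); return 1; }
    if (fread(sector, 1, sizeof sector, f) != sizeof sector) {
        perror("fread");
        fclose(f);
        return 1;
    }
    fclose(f);
    /* A legacy (MBR-style) boot sector ends with the magic bytes 0x55, 0xAA. */
    if (sector[510] == 0x55 && sector[511] == 0xAA)
        printf("%s: boot signature present (0x55AA)\n", path);
    else
        printf("%s: no boot signature found\n", path);
    return 0;
}
```
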
  • O[perating] S[ystem]:

    ▼ Device Drivers

    • ☑ Various drivers (hardware instructions) are loaded into a combination of Kernel Mode/Space and User Mode/Space.
      ☑ Kernel Mode/Space: more performant; runs high-access, privileged processes.
      ☑ User Mode/Space: protected, less privileged processes; stable but slower.
        ◆ NOTE: moving data between spaces and modes is slow and hardware intensive (a rough timing sketch follows this list).
      ☑ Virtual Device Drivers: give emulated environments (ie, DOS, VMware) access to device facilities like IRQs.
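
To get a feel for why crossing between user space and kernel space is considered slow, this Linux-specific sketch times a trivial user-space function against a raw getpid system call (via syscall(2), to sidestep any libc caching). Exact numbers will vary by machine.

```c
/* Hedged sketch: a rough feel for the cost of a user->kernel->user round trip,
 * compared against a call that stays entirely in user space. Linux-specific. */
#define _GNU_SOURCE
#include <stdio.h>
#include <time.h>
#include <unistd.h>
#include <sys/syscall.h>

static long dummy(long x) { return x + 1; }   /* stays entirely in user space */

static double elapsed_ns(struct timespec a, struct timespec b) {
    return (b.tv_sec - a.tv_sec) * 1e9 + (b.tv_nsec - a.tv_nsec);
}

int main(void) {
    enum { N = 1000000 };
    struct timespec t0, t1;
    volatile long sink = 0;

    clock_gettime(CLOCK_MONOTONIC, &t0);
    for (int i = 0; i < N; i++) sink += dummy(i);
    clock_gettime(CLOCK_MONOTONIC, &t1);
    printf("user-space call: %.1f ns/iter\n", elapsed_ns(t0, t1) / N);

    clock_gettime(CLOCK_MONOTONIC, &t0);
    for (int i = 0; i < N; i++) sink += syscall(SYS_getpid);  /* crosses into the kernel */
    clock_gettime(CLOCK_MONOTONIC, &t1);
    printf("system call:     %.1f ns/iter\n", elapsed_ns(t0, t1) / N);
    return 0;
}
```
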
  • ▼ Process Management:

    • ☑⁡⁡ Multitasking/programming:
        ◆ Parallel processing:
          ☑ Multi-core processors.
          ☑ Bridging processors.
          ☑ Distributed systems (nodes).
        ◆ Specialized Bus architecture:
          ☑ Pipes optimized for parallel data flow.
          ☑ Pipes/Units optimized for parallel micro-instructions.
        ◆ Interleaving processes.
      ☑ Process Creation (see the fork/wait sketch after this list):
        ◆ Process batches.
        ◆ Parent/child process spawning.
      ☑⁡⁡ Process Termination:
        ◆ Batch halt instructions.
        ◆ Hardware errors.
        ◆ Software errors.
        ◆ User instructions.
        ◆ Process completed.
      ☑⁡⁡ Two-state process model:
        ◆ RUNNING.
        ◆ NOT RUNNING.
      ☑⁡⁡ Three-state process:
        ◆ RUNNING.
        ◆ READY (queued).
        ◆ BLOCKED (requires event handling to change state).
      ☑ Five-state process model: adds swapping, or 'suspending', processes into buffers for efficiency:
        ◆ RUNNING.
        ◆ READY (queued).
        ◆ BLOCKED (requires event handling to change state).
        ◆ READY SUSPEND (READY process loaded into a swap buffer).
        ◆ BLOCKED SUSPEND (BLOCKED process loaded into a swap buffer).
      ☑ Interrupt Requests (IRQs, aka 'Interrupts'):
        ◆ The system that manages and interleaves device processes with system processes.
          ◆ IRQ lines send signals from hardware to the CPU (Central Processing Unit) to 'interrupt' it, ie, take priority over the current work.
            ◆ Interrupt Handler: the program that loads the interrupt and processes it.
            ◆ Used for various inputs: keyboard, mouse, hard drive, sound, video, etc.
            ◆ IRQs are numbered and can be set in the BIOS to avoid conflicts between devices.
            ◆ Newer devices may have dedicated Interrupt Controllers and/or support IRQ sharing.
            ◆ 'Plug and Play' is the system of auto-configuring IRQs when devices are inserted.
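
Here's a minimal sketch of process creation and termination on a Unix-like OS: the parent spawns a child with fork(), the child runs and exits, and the parent blocks in waitpid() until the child's exit event arrives.

```c
/* Hedged sketch of process creation and termination on a Unix-like OS:
 * fork() spawns a child (parent/child spawning), the child terminates itself,
 * and the parent sits BLOCKED in waitpid() until the exit event arrives. */
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

int main(void) {
    pid_t pid = fork();               /* process creation: parent/child spawning */
    if (pid < 0) {
        perror("fork");
        return EXIT_FAILURE;
    }
    if (pid == 0) {                   /* child: RUNNING, then terminates itself */
        printf("child  %d: doing some work\n", getpid());
        exit(42);                     /* process termination: "process completed" */
    }
    int status = 0;
    waitpid(pid, &status, 0);         /* parent BLOCKED until the child's exit event */
    if (WIFEXITED(status))
        printf("parent %d: child exited with status %d\n", getpid(), WEXITSTATUS(status));
    return EXIT_SUCCESS;
}
```
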
  • ▼ File system

    • ☑⁡⁡ Provides a directory/folder structure:
        ◆ Uses tables and/or nodes for look-up.
        ◆ Examples include NTFS and FAT (Windows), HFS and HFS+ (Apple), ext and XFS (Linux).
        ◆ Metadata: miscellaneous data for tagging and managing files (ie, attributes on Windows); see the stat() sketch after this list.
      ☑ User Interface (UI): provides tools for manipulating files:
        ◆ Text based: uses typed commands via a typing device (ie, keyboard or touchscreen).
        ◆ Graphical User Interface (GUI, "goo-ee"): uses a pointer device (ie, a mouse or touchscreen).
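
A quick sketch of reading the metadata a filesystem keeps per file (size, ownership, permission bits, timestamps) using stat(2). The default /etc/hostname path is just an example; pass any file on the command line.

```c
/* Hedged sketch: dumping the metadata a filesystem keeps for a file with stat(2). */
#include <stdio.h>
#include <sys/stat.h>
#include <time.h>

int main(int argc, char **argv) {
    const char *path = argc > 1 ? argv[1] : "/etc/hostname";  /* example default */
    struct stat st;
    if (stat(path, &st) != 0) {
        perror(path);
        return 1;
    }
    printf("%s\n", path);
    printf("  size : %lld bytes\n", (long long)st.st_size);
    printf("  owner: uid %u, gid %u\n", (unsigned)st.st_uid, (unsigned)st.st_gid);
    printf("  mode : %o (permission bits)\n", (unsigned)(st.st_mode & 07777));
    printf("  mtime: %s", ctime(&st.st_mtime));               /* last modification */
    return 0;
}
```
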
  • ▼ Memory Management

    • ☑ Virtual Memory: swapping pages between RAM and hard disk storage to increase the memory available for processing (see the page-fault sketch after this list).
        ◆ Levels of memory:
          ☑ OS memory: pre-cached and makes the OS and UI more responsive.
          ☑ Application memory: garbage collecting the memory of exited processes.
            ◆ Improves application performance if the OS has memory to spare.
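
A small sketch of virtual memory and demand paging on Linux: a large anonymous mmap() costs almost nothing until the pages are touched, at which point the kernel faults them in one page at a time (visible as minor page faults).

```c
/* Hedged sketch: virtual memory in action. A large anonymous mapping is only
 * address space until it is touched; the kernel then faults pages in on demand.
 * Linux/POSIX specific; numbers will vary by system. */
#define _DEFAULT_SOURCE
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/resource.h>
#include <unistd.h>

static long minor_faults(void) {
    struct rusage ru;
    getrusage(RUSAGE_SELF, &ru);
    return ru.ru_minflt;              /* page faults serviced without disk I/O */
}

int main(void) {
    long page = sysconf(_SC_PAGESIZE);
    size_t len = 64 * 1024 * 1024;    /* reserve 64 MiB of address space */
    char *buf = mmap(NULL, len, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (buf == MAP_FAILED) { perror("mmap"); return 1; }

    long before = minor_faults();
    memset(buf, 1, len);              /* touching pages forces them to be backed */
    long after = minor_faults();

    printf("page size: %ld bytes\n", page);
    printf("minor page faults while touching 64 MiB: %ld (~%ld expected)\n",
           after - before, (long)(len / page));
    munmap(buf, len);
    return 0;
}
```
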
  • ▼ I[nput]/O[utput]

    • ☑ I/O interfaces and abstractions for the Bus and IRQs:
        ◆ Memory-mapped I/O: device registers exposed at memory addresses (see the /dev/mem sketch after this list).
        ◆ Channel I/O: DMA driven by dedicated, programmable I/O processors, with more capabilities.
        ◆ Port-mapped I/O: port number assignment and management.
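
Here's a hedged sketch of memory-mapped I/O from user space: mapping a window of physical register space through /dev/mem and reading a device register with a plain load. The base address is a placeholder assumption (the Raspberry Pi/BCM2835 GPIO block is a common tutorial example); substitute your platform's, run as root, and note that many kernels lock /dev/mem down entirely.

```c
/* Hedged sketch of memory-mapped I/O from user space: map a physical register
 * window through /dev/mem and read a device register as if it were memory.
 * REG_BASE is a placeholder assumption; needs root, and /dev/mem is often restricted. */
#include <fcntl.h>
#include <stdint.h>
#include <stdio.h>
#include <sys/mman.h>
#include <unistd.h>

#define REG_BASE 0x20200000UL   /* assumption: platform-specific physical base */
#define REG_SPAN 4096UL         /* one page of register space */

int main(void) {
    int fd = open("/dev/mem", O_RDWR | O_SYNC);
    if (fd < 0) { perror("/dev/mem (root required)"); return 1; }

    volatile uint32_t *regs = mmap(NULL, REG_SPAN, PROT_READ | PROT_WRITE,
                                   MAP_SHARED, fd, REG_BASE);
    if (regs == MAP_FAILED) { perror("mmap"); close(fd); return 1; }

    /* Plain loads/stores now go straight to the device's registers. */
    printf("register 0 reads as 0x%08x\n", (unsigned)regs[0]);

    munmap((void *)regs, REG_SPAN);
    close(fd);
    return 0;
}
```
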
  • ▼ Networking

    • ☑ Software control of networking hardware: topology, routing, packet management, port mapping, protocol configuration (see the interface-listing sketch below).
    • ☑⁡⁡ Providing a visual interface for networking hardware.
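
A small sketch of the OS exposing its view of the networking hardware to software: getifaddrs() lists the interfaces and IPv4 addresses the kernel's network stack is currently managing.

```c
/* Hedged sketch: listing the network interfaces and IPv4 addresses the kernel
 * currently manages, via getifaddrs(). POSIX/Linux-style systems. */
#include <arpa/inet.h>
#include <ifaddrs.h>
#include <netinet/in.h>
#include <stdio.h>
#include <sys/socket.h>

int main(void) {
    struct ifaddrs *ifs = NULL;
    if (getifaddrs(&ifs) != 0) { perror("getifaddrs"); return 1; }

    for (struct ifaddrs *i = ifs; i != NULL; i = i->ifa_next) {
        if (!i->ifa_addr || i->ifa_addr->sa_family != AF_INET)
            continue;                                   /* IPv4 only, for brevity */
        char ip[INET_ADDRSTRLEN];
        struct sockaddr_in *sa = (struct sockaddr_in *)i->ifa_addr;
        inet_ntop(AF_INET, &sa->sin_addr, ip, sizeof ip);
        printf("%-8s %s\n", i->ifa_name, ip);           /* interface name + address */
    }
    freeifaddrs(ifs);
    return 0;
}
```
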

    ▼ Security

    • ☑⁡⁡ Integrity:
        ◆ Vulnerability patches.
      ☑ Integrated tools:
        ◆ Malware removal.
        ◆ Firewall/net traffic monitoring and control.
        ◆ Network encryption.
      ☑⁡⁡ Filesystem:
        ◆ Local encryption.
        ◆ User/privilege metadata.
      ☑⁡⁡ UI for software tools:
        ◆ ie, the Metasploit Framework for pen[etration] testing.
      ☑⁡⁡ UI for hardware tools:
        ◆ ie, Oracle's hardware-based database firewall.
  • CPU Architecture:

    ▼ CPU Concurrency (Synching or Timing Operations):

    • ☑⁡⁡ Single Core/non-distributed:
        ◆⁡⁡ Programmed Threads and Bus architecture divide up the work.
          ☑⁡⁡ [bus] bridge[s]: hardware connecting multiple buses.
        ◆ Absent such hardware, the CPU can interleave a schedule of instructions to share and manage resources.
      ☑ Multi-core/distributed systems:
        ◆ Threads and servers divide up the workload per node or core (see the pthreads sketch after this list).
      ☑ Overclocking Processors (ie, CPUs & GPUs) by setting the clock speed higher:
        ◆ Can cause timing and synchronization issues, leading to more errors and worse performance.
        ◆ Also causes OS freezes/crashes, heat damage, screen flickers, 'random' behavior, and errors.
        ◆ Overclocking settings and performance depend on the specific model and hardware modding abilities (ie, increased cooling capacity).
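
A minimal sketch of threads dividing up a workload across cores: four POSIX threads each sum one slice of an array and the main thread merges the partial results. Compile with -pthread.

```c
/* Hedged sketch: splitting work across threads. Each worker sums its own slice
 * of an array; the main thread joins them and merges the partial sums. */
#include <pthread.h>
#include <stdio.h>

#define N 1000000
#define THREADS 4

static long data[N];

struct slice { int start, end; long sum; };

static void *worker(void *arg) {
    struct slice *s = arg;
    for (int i = s->start; i < s->end; i++)
        s->sum += data[i];            /* each thread touches only its own slice */
    return NULL;
}

int main(void) {
    for (int i = 0; i < N; i++) data[i] = 1;

    pthread_t tid[THREADS];
    struct slice parts[THREADS];
    for (int t = 0; t < THREADS; t++) {
        parts[t] = (struct slice){ t * (N / THREADS), (t + 1) * (N / THREADS), 0 };
        pthread_create(&tid[t], NULL, worker, &parts[t]);
    }

    long total = 0;
    for (int t = 0; t < THREADS; t++) {
        pthread_join(tid[t], NULL);   /* wait for each worker, then merge */
        total += parts[t].sum;
    }
    printf("total = %ld\n", total);   /* expect 1000000 */
    return 0;
}
```
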
  • ▼ Assembly language & Machine language:

    • ☑⁡⁡ Machine language (code): a low level language (consisting of binary or hex).
        ◆ Higher level languages compile down into this.
        ◆ Generally this is the level just above vendor based hardware instructions or microcode.
      ☑ Assembly language: human-readable mnemonics that an assembler translates into machine code.
        ◆ Uses opcodes to manage and interface with hardware instructions.
        ◆ Ideal for specialized hardware tasks, stripping away much of the cruft and safeguards of higher-level languages.
        ◆ Generally the lowest level a human programs at, outside of hacks and analog methods (see the inline-asm sketch after this list).
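
A tiny sketch of dropping from C down to assembly using GCC/Clang extended inline asm on x86-64: the ADD opcode is issued directly and the compiler wires up the registers. Not portable beyond that toolchain and architecture.

```c
/* Hedged sketch: issuing one assembly instruction directly from C with GCC/Clang
 * extended inline asm on x86-64. The compiler only allocates the registers. */
#include <stdio.h>

int main(void) {
    long a = 40, b = 2, sum;
    /* sum starts as a copy of a ("0" ties input operand 1 to output 0), then ADDs b. */
    __asm__("addq %2, %0"
            : "=r"(sum)
            : "0"(a), "r"(b));
    printf("%ld + %ld = %ld\n", a, b, sum);
    return 0;
}
```
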
  • ▼ GPUs (Graphics Processing Units) in Machine Learning & AI (Artificial Intelligence):

    • ☑ Offloading intensive processing to more specialized hardware and APIs (ie, Vulkan, Direct3D, OpenGL) that can also be linked together efficiently and cost-effectively (see the device-query sketch after this list).
      ☑⁡⁡ GPU cards typically have faster RAM (Random Access Memory) known as VRAM or Video RAM.
        ◆ nVidia in particular offers Tensor Cores, hardware built first and foremost for AI workloads.
        ◆ nVidia DGX model workstations/servers comprise multiple cards that specialize in AI processing.
        ◆ nVidia Jetson is a specialized and relatively cheap palm-sized computer for AI training.
        ◆ nVidia NGC now has cloud based servers for GPU AI training.
        ◆ nVidia NGX features hybrid cloud technology.
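
To tie this back to offloading, here's a sketch in plain C against the CUDA runtime API that asks the driver what GPU hardware (and how much VRAM) is available before any work is sent to it. It assumes an nVidia GPU with the CUDA toolkit installed (compile with nvcc or link against -lcudart); Vulkan and OpenCL offer similar device queries on other hardware.

```c
/* Hedged sketch: querying available GPU hardware before offloading work to it.
 * Plain C against the CUDA runtime API; assumes the CUDA toolkit is installed. */
#include <stdio.h>
#include <cuda_runtime_api.h>

int main(void) {
    int count = 0;
    if (cudaGetDeviceCount(&count) != cudaSuccess || count == 0) {
        printf("no CUDA-capable GPU found\n");
        return 1;
    }
    for (int i = 0; i < count; i++) {
        struct cudaDeviceProp prop;
        cudaGetDeviceProperties(&prop, i);
        printf("GPU %d: %s\n", i, prop.name);
        printf("  VRAM            : %.1f GiB\n",
               prop.totalGlobalMem / (1024.0 * 1024.0 * 1024.0));
        printf("  multiprocessors : %d\n", prop.multiProcessorCount);
    }
    return 0;
}
```
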
