Cluster Computing Archives

Clusterboard A64 Insidious Reset Problem: Solved

Eric | March 26, 2021 | May 8, 2021

A bare-metal compute node may soft-lock, spin-lock, deadlock, overheat, or starve for resources; the Docker daemon may go away, systemd may become unstable, and so on. In these cases, a watchdog timer, acting like a dead man’s switch, is no longer refreshed, so its countdown reaches zero and the watchdog circuit restarts all the hardware. However, the Clusterboard A64 SoCs have a WDT reset problem, which we solve satisfyingly.
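The dead man's switch behaviour can be sketched as a small software model. This is only an illustration of the general watchdog idea, not the A64 hardware WDT or the fix described in the post; the names `Watchdog`, `FakeClock`, and `simulate`, and the timeout and kick interval values, are all assumptions made for the example.

```python
import time


class Watchdog:
    """Software model of a watchdog timer (a dead man's switch)."""

    def __init__(self, timeout_s, clock=time.monotonic):
        self.timeout_s = timeout_s
        self._clock = clock
        self._deadline = clock() + timeout_s

    def kick(self):
        """Called by a healthy system to push the deadline back."""
        self._deadline = self._clock() + self.timeout_s

    def expired(self):
        """True once the countdown has run out without a kick."""
        return self._clock() >= self._deadline


class FakeClock:
    """Hypothetical manual clock so the simulation runs instantly."""

    def __init__(self):
        self.now = 0.0

    def __call__(self):
        return self.now

    def advance(self, seconds):
        self.now += seconds


def simulate(hang_after_kicks, timeout_s=10.0, kick_interval_s=2.0, steps=20):
    """Return the simulated time at which a reset would fire, or None."""
    clock = FakeClock()
    wd = Watchdog(timeout_s, clock=clock)
    kicks = 0
    for _ in range(steps):
        clock.advance(kick_interval_s)
        if wd.expired():
            return clock.now  # real hardware would restart the node here
        if kicks < hang_after_kicks:
            wd.kick()  # the node is still healthy and refreshes the timer
            kicks += 1
        # otherwise the node has "hung" and stops kicking
    return None


if __name__ == "__main__":
    print(simulate(hang_after_kicks=3))    # node hangs after three kicks
    print(simulate(hang_after_kicks=100))  # node never hangs in 20 steps
```

In this model a node that keeps kicking is never reset, while one that hangs is reset one full timeout after its last kick.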

Cluster Computer Gotchas: U-Boot, Device Trees, iPXE

Eric | March 10, 2021 | April 16, 2021

This post gathers in one place all the hardware and software gotchas I’ve encountered, from compiling and running the U-Boot bootloader to fixing timing issues in a cluster computer of Allwinner A64 SoCs, both to help others with similar issues and to remind myself.

Embedded Linux Completely from Scratch

Eric | February 23, 2021 | February 28, 2021

Each node of my cluster computer is nameless and stateless like an AWS Lambda, so the entire OS must reside in memory. Having explored minimal Debian, Alpine Linux, and even RancherOS, I concluded that the most exciting path is to compile the embedded Linux kernel and bootloaders from scratch for ARM64 and to network-boot bare-metal hardware over HTTP.

Attempts at Supplying Efficient Logic-Level Voltage with no Decay to Sensitive Electronics

Eric | October 10, 2019 | September 30, 2020

A power supply, when suddenly turned off, bleeds voltage slowly. Attached electronics experience a gradual voltage decline from 5V through 3.3V and eventually to zero. The problem is that microcontrollers and microprocessors don’t know how to behave when under-volted: their behavior and flash memory integrity are not defined, and flash memory can even be erased. Here I outline my attempts to achieve an efficient logic-level power supply.

Cluster Computing – Benchmarking Local Storage

Eric | December 11, 2018 | April 14, 2019

Cluster Computing – Choosing Local Storage

Given a cluster computing rig of twenty-eight processors, each can use either a USB 2.0 drive or a microSD card for local flash storage. Which type and maker of flash is the fastest? Make the wrong choice and the cluster is painfully slow. Not all microSD cards or USB drives are made the same, and, interestingly, random read and write speeds vary wildly. Here I test several storage configurations with striking benchmark results.

Cluster Computing – Hardware Planning

Eric | September 19, 2018 | April 26, 2022

Let’s build a 112-core 1.2GHz A53 cluster with 56GB of DDR3 RAM and 584GiB of high-availability distributed file storage, drawing at most 200W. The goal is to use cluster computing to perform fast Apache Spark operations on Big Data, all on-prem and for a fraction of what cloud computing costs.