Learning Python as a Java Developer: Write a Linux RGB-LED CPU Monitor and Daemon

Goal: Let’s write a CPU monitor daemon in Python 3 to smoothly animate an RGB LED according to the CPU load. We’ll use the modified BlinkStick (FadeStick) to offload LED animation processing away from the host CPU. Let’s learn Python 3 from scratch as a Java developer and see what gotchas we encounter.

Given a FadeStick to handle animations itself, let’s write some Python 3 driver code.

FadeStick to process fades and blink patterns
Heads up: I’m jumping into Python 3.8 for the first time as a Java developer today. Without books or online courses, let’s figure out Python paradigms and gotchas through trial and error to help other eager Java developers new to Python.

Sections

  1. First Python Observations as a Java Developer
  2. Which Python IDE?
  3. Goal 1: Understand Some Python Code
  4. Goal 2: Write Unit Tests
  5. Goal 3: Run all Unit Tests Everywhere in PyCharm (demo)
  6. Goal 4: Issue USB Color Animation Commands
  7. Goal 5: Interrogate the CPU (demo)
  8. Goal 6: Write a Daemon in Python (demo)
  9. Goal 7: Package the App into a Standalone Executable
  10. Results (source code)
  11. Bonus: Automatic Code Formatting on File Save
  12. Bonus: Inter-Process Communication (IPC)
  13. Bonus: Reduce Standalone Binary Size

First Python Observations as a Java Developer

Here is a Reader’s Digest of my first Python observations as a Java developer.

  • Python is five years older than Java (released in ’91 vs ’96).
  • There is no separate char type: single quotes ('') and double quotes ("") both create strings and can be mixed and matched, unlike in Java.
  • String multiplication is new. E.g. print("a" * 5) results in “aaaaa”.
  • Negative indices count from the end, but wrap around just once. E.g. "abc"[-1] is c and "abc"[-3] is a, but "abc"[-4] is an error.
  • Multiple ways to format strings exist. E.g. print('%d %d' % (1, 2)), print(f'{1} {2}'), …
  • String formatting is a joy to work with in Python 3.
  • Python has Heredoc strings using """. Java got that late in JDK 15.
  • I have no idea how lists, tuples, and dictionaries use memory under the hood, and I’m worried.
  • It will take a while to get used to True and False – Pascal case.
  • if not x % 2: is valid syntax, but if ! x % 2 is not. In Java it’s if( x % 2 == 0 ){...}.
  • Python functions can have default values like PHP and C++. E.g. def println(x=""):.
  • PyCharm wants two blank lines after a class or function definition.
  • Class variables are public by default.
  • There is no new keyword when instantiating new objects (but __new__ exists).
  • self is an explicit first parameter of instance methods, loosely similar to this in JavaScript.
  • Class variables can be first “declared” in the __init__ method of a class. Quite JS-like.
  • Displaying dates and times is much more convenient in Python than in Java.
  • Importing classes and functions feels very NodeJS-like.
  • Lists can end with a trailing comma like in TypeScript.
  • Variables and functions are lowercase with underscores (snake_case) by convention. Coming from Java, this feels unnatural.
  • Python uses None instead of null, and automatically returns None without return.
  • Python hints at private methods with an underscore prefix. E.g. def _privateMethod():
  • Empty collections evaluate to False. E.g. assert not [] is valid.
  • The order of imports matters – it’s possible to get circular dependencies.
  • Python has interval comparisons. E.g. if 0 <= number <= 255:. That’s cool.
  • There is no C-style ternary operator (?:); Python uses a conditional expression instead, e.g. x if condition else y.
  • Lambdas exist, kind of. Observations are here.
  • Python uses named parameters (ordered by default). Love this.
  • Class variables are also called “fields” like in Java.
  • Try-with-Resources is accomplished with the with keyword. Nice.
  • Python can return (yield) from a try-raise then come back to the finally. Not in Java.
  • There is no switch statement in Python 3.8 – use if-elif (a match statement arrived later, in Python 3.10).
  • Serializing “objects” is called pickling. De-serializing is called unpickling [1].

Let’s go from for-loops right to our featured project with no ramp-up.


Which Python IDE?

Which IDE to use for this project? Books and tutorials seem to favor Sublime Text. Let’s use JetBrains PyCharm Professional in Linux. I’ve not used it before, but it’ll probably be my new favorite IDE because JetBrains consistently makes my development life convenient and enjoyable. We have breakpoints, jumping to methods, Code With Me, Docker integration, and much more.

Hello, what is this PyCharm Edu?


Some helpful people put together courses on Python which run right in PyCharm, and it even integrates into Coursera online learning. Thoughtful.


Goal 1: Understand Some Python Code

Most of the boilerplate is done for us, so we just have to extract the IO features for USB communication of the BlinkStick, update the code for Python 3, and add in a lot of custom driver code, CPU monitoring code, and daemonize the app. First, let’s reverse engineer BlinkStick-Python to understand what a Python app looks like.

This ~1600 LOC looks readable enough. Let’s write a main file that checks if the FadeStick is plugged into a USB port.

Importing modules: Why can’t we import usb.core?

Here is a first attempt to access a USB device.

There are underlines in PyCharm asking if we want to import usb.core and usb.util. Research shows that if we add a requirements.txt to the project root with the following entry, the problem is solved.

The “hello world” script now succeeds with:

Variable case: All variables and functions are lowercase.

Coming from C++, Objective-C, and Java, camelcase is the only case that gives me pride in my work. Let’s turn off these case errors in PyCharm.

Camelcase causes a warning in PyCharm

File structure: How to organize Python code and classes?

Coming from Java (and this being day one of Python), let’s use camelcase and place each class in its own file. This should make unit testing cleaner, right?

Python file organization as a Java developer
Constants: Why are constants not part of Python?

No idea. However, as of Python 3.8 (Oct 2019), there is typing.Final so we can write constants like so:
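As a minimal sketch of the typing.Final idea (the constant names and the helper below are made up for illustration, not taken from the FadeStick source): note that Final is only checked by static tools such as mypy or PyCharm’s inspections. The interpreter itself will happily let you reassign the name.

```python
from typing import Final

# Hypothetical constants for illustration only.
MAX_CHANNEL: Final = 255
DEVICE_NAME: Final[str] = "FadeStick"


def clamp_channel(value: int) -> int:
    """Clamp a color channel into the 0..MAX_CHANNEL range."""
    return max(0, min(MAX_CHANNEL, value))
```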

Type hinting: How to type hint dynamically-typed variables and return types?

As of Python 3.5 (Sept. 2015), we can take advantage of type hinting like so.

Additionally, and this is like TypeScript, we can hint multiple types as well. For example:
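A small sketch of both ideas, with hypothetical function names: plain parameter and return hints, plus Union and Optional from the typing module for “one of several types”. The hints are not enforced at runtime; they help the IDE and static checkers.

```python
from typing import List, Optional, Union

Number = Union[int, float]  # a reusable alias for "int or float"


def average(values: List[Number]) -> Optional[float]:
    """Return the mean of values, or None for an empty list."""
    if not values:
        return None
    return sum(values) / len(values)


def describe(value: Union[int, str]) -> str:
    """Accept either an int or a str, much like a TypeScript union."""
    return f"{type(value).__name__}: {value}"
```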

toString(): How to print a collection of objects?

We can use the magic method __repr__ in each object to achieve this. If __str__ is not defined, then a str(my_object) call will use __repr__. Here is an example:
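A minimal sketch, assuming a made-up Pixel class: printing a list calls repr() on each element, so defining __repr__ is enough to get readable output for a whole collection, and str() falls back to it as well.

```python
class Pixel:
    """Hypothetical class used only to demonstrate __repr__."""

    def __init__(self, red: int, green: int, blue: int) -> None:
        self.red = red
        self.green = green
        self.blue = blue

    def __repr__(self) -> str:
        return f"Pixel(red={self.red}, green={self.green}, blue={self.blue})"


pixels = [Pixel(255, 0, 0), Pixel(0, 0, 255)]
```

Calling print(pixels) then shows each Pixel through its __repr__ rather than as an opaque object address.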


Goal 2: Write Unit Tests

My FadeStickUSB.py USB abstraction class is getting complex. Let’s write some unit tests in Python, but how?

PyCharm helpfully auto-generates a unit test stub, but we have to name it. From research, many people prefer to prefix or affix unit tests with test_ or _test.py and place them in the same folder as the source code. Coming from Java, we know that a dedicated tests folder holds JUnit tests (which I’m told inspired Python unit tests). Why the same folder as code? I’m told it makes refactoring safer. Here is a generated stub.
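For readers who have not seen one, a unit test in the standard unittest module looks roughly like this. The to_hex helper is a hypothetical function under test, not part of the project.

```python
import unittest


def to_hex(red: int, green: int, blue: int) -> str:
    """Hypothetical helper under test: format a color as #rrggbb."""
    return f"#{red:02x}{green:02x}{blue:02x}"


class ToHexTest(unittest.TestCase):
    def test_black(self):
        self.assertEqual("#000000", to_hex(0, 0, 0))

    def test_mixed(self):
        self.assertEqual("#ff8000", to_hex(255, 128, 0))
```

Much like JUnit, every method whose name starts with test is collected and run, and python -m unittest on the command line discovers files named test*.py by default.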

Circular reference: How to avoid circular imports like: “ImportError: cannot import name ‘openUSBDevice’ from partially initialized module ‘core.FadeStickUSB'”?

We’ve discovered that the order of imports matters, and that the file name should not normally be the same as the class name (unlike in Java), or else the import may import the file, not the class in the file, by accident. This is more of a human error and happens in NodeJS as well. See below.

Import a class or a file

Alternatively, imports can be local, meaning we can import modules and methods in the middle of a class or function. This is so very odd coming from C++ and Java.

Another pattern to prevent circular imports due to type-hinting is to add the following at the beginning of a file (until it is enabled by default in Python 4). Why? Type hints are a fantastic addition to Python, but custom objects as type hints need to be resolved/imported, so that can lead to circular imports. Yikes.

Yet another pattern is to prevent offending imports due to type hinting until runtime with a bit of logic.
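A minimal sketch of that second pattern, using typing.TYPE_CHECKING, which is False at runtime and treated as true by static type checkers. The module path core.fade_stick and the function are hypothetical. The hint is quoted so it is never evaluated at runtime; a from __future__ import annotations line at the very top of the file has the same effect for every annotation in that file.

```python
from typing import TYPE_CHECKING

if TYPE_CHECKING:
    # Only static type checkers follow this import. It is skipped at runtime,
    # so it cannot take part in a circular import. The module name is made up.
    from core.fade_stick import FadeStick


def describe_device(device: "FadeStick") -> str:
    """The quoted hint stays a plain string and is never resolved at runtime."""
    return f"device: {device!r}"
```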

Exceptions: Should Python methods raise an exception, or return None?

In Java, we throw specialized exceptions left and right to control flow. I suspect the Pythonic way is to return None. But, you know what? Personally, it feels cleaner and more satisfying to throw a specialized exception with a helpful message than to return None (null, in Java).

The same as above, but with a helpful exception:
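To make the comparison concrete, here is a sketch of both styles side by side. The DEVICES dictionary stands in for the USB bus, and every name in it is hypothetical.

```python
from typing import Dict, Optional


class DeviceNotFoundError(Exception):
    """Hypothetical specialized exception for a missing device."""


# Stand-in for the USB bus: serial number -> device name.
DEVICES: Dict[str, str] = {"FS-001": "FadeStick"}


def find_or_none(serial: str) -> Optional[str]:
    """Return None on a miss; the caller must remember to check."""
    return DEVICES.get(serial)


def find_or_raise(serial: str) -> str:
    """Raise a descriptive exception on a miss."""
    try:
        return DEVICES[serial]
    except KeyError:
        # "from None" hides the internal KeyError from the traceback.
        raise DeviceNotFoundError(f"No device with serial {serial!r}") from None
```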

Function overloading: How to overload functions based on parameter type-hinting?

Apparently, Python has no built-in function overloading (a second def with the same name simply replaces the first), so many code writers use guards and if-else statements to accept dynamic (non-typed or Any) parameters instead of writing polymorphic functions. My instinct is to write polymorphic methods/functions like this.

Unsuccessful function overloading

Let’s see if anyone else prefers to overload functions as well… and they do, sort of. Python uses decorators from community libraries to achieve @overload, @dispatch, @multimethod, and more. Here is a readable solution that works well for me using parameter types in the annotation.

and

Yes, in Python, functions can float around a script outside a class, but the order of declaration matters. It feels like a .py file acts like a filename-based namespace for its functions.
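For comparison with the community decorators mentioned above, the standard library has its own take: functools.singledispatch. It is a different technique from the libraries named earlier, and it dispatches on the type of the first argument only. A sketch with a hypothetical to_rgb function:

```python
from functools import singledispatch


@singledispatch
def to_rgb(color) -> tuple:
    """Fallback for unsupported argument types."""
    raise TypeError(f"Unsupported color type: {type(color).__name__}")


@to_rgb.register
def _(color: str) -> tuple:
    """Handle '#rrggbb' strings."""
    digits = color.lstrip("#")
    return tuple(int(digits[i:i + 2], 16) for i in (0, 2, 4))


@to_rgb.register
def _(color: int) -> tuple:
    """Handle 0xRRGGBB integers."""
    return (color >> 16) & 0xFF, (color >> 8) & 0xFF, color & 0xFF
```

The type annotation on the first parameter of each registered function picks the overload, which is as close to Java-style overloading as the standard library gets.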

Variable scope: Do variables auto-hoist like in JavaScript?

It looks like we can assign a variable for the first time in a try-catch block (try-except in Python), and said variable is available outside the try block. This resembles var hoisting in JavaScript: Python scopes variables to the enclosing function, not to the block. This is personally worrisome, especially since JS introduced the let keyword specifically to prevent hoisting-related bugs. Here is an example.
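A short sketch of the behavior with made-up function names: names assigned inside try blocks and for loops live on in the enclosing function.

```python
def parse_channel(text: str) -> int:
    try:
        value = int(text)  # first assignment happens inside the try block
    except ValueError:
        value = 0  # without this line, the return below raises UnboundLocalError
    return value  # still in scope: Python scopes by function, not by block


def leaky_loop() -> int:
    for index in range(3):
        pass
    return index  # the loop variable also outlives the loop
```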

Annotations: How easy is it to make our own annotations in Python?

Java uses annotations. Python uses decorators to modify method functionality that resembles Java annotations. Let’s make our own @disabled annotation (decorator) to mark Python unit tests as disabled.

Then in unit tests, we can disable tests easily with a custom message as well.
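One way such a decorator could look (a sketch, not necessarily how the project does it) is a thin wrapper around the standard unittest.skip, so the test runner reports the test as skipped along with the custom message.

```python
import unittest


def disabled(reason: str = "disabled"):
    """Hypothetical @disabled decorator built on unittest.skip."""
    return unittest.skip(reason)


class ColorTest(unittest.TestCase):
    @disabled("waiting on hardware")
    def test_needs_device(self):
        self.fail("never runs")

    def test_always_runs(self):
        self.assertTrue(True)
```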

Test generators: How to write parameterized unit tests in Python?

Java has JUnit parameterized tests to use test values from files or some collection of input-expected pairs. Having just discovered lambdas exist in Python, and poring over the unit test framework source code, here is my first attempt at parameterized unit testing in Python with a lambda. We see that we need to bind outer variables to lambda parameters.
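The binding gotcha can be sketched like this, with a hypothetical clamp function under test. Test methods are generated in a loop, and each lambda captures the current loop values through default arguments; a bare closure would look the variables up later and see only the last iteration.

```python
import unittest

# The gotcha in isolation: closures look up i when called, not when created.
late = [lambda: i for i in range(3)]
bound = [lambda i=i: i for i in range(3)]


def clamp(value: int) -> int:
    """Hypothetical function under test: clamp to the 0..255 range."""
    return max(0, min(255, value))


class ClampTest(unittest.TestCase):
    pass


CASES = {"below": (-1, 0), "inside": (128, 128), "above": (300, 255)}

for name, (given, expected) in CASES.items():
    # Default arguments bind the current loop values to each generated test.
    setattr(
        ClampTest,
        f"test_{name}",
        lambda self, given=given, expected=expected: self.assertEqual(
            expected, clamp(given)
        ),
    )
```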

And the output is similar to this.

Finally, Python has test class/method setup and teardown patterns too. For instance,

Let’s move on to the next goal now that we can write unit tests effectively in Python.


Goal 3: Run all Unit Tests Everywhere in PyCharm

PyCharm isn’t like IntelliJ IDEA Pro where there is a convenient green arrow button to run all JUnit tests in all folders. From research, unless all the test files are in the same folder, PyCharm will not test them all as a suite.

Test suites: How to run all test files everywhere as a suite in PyCharm?

In PyCharm, one cannot simply right-click on the project root folder and hope the “run tests” button will appear in the context menu.

We could drop to the shell and execute the unit tests with a recursive search pattern, but then we lose the satisfying visual green bar that grows across the IDE showing us all our tests have passed.

Let’s try something cool: let’s write a script to symlink and synchronize tests throughout the project to a git-ignored tests folder in which we can run all the tests in one go. The result looks like the following.

Each time the following script is run, the previous symlinks are cleared and new symlinks are created to keep tests in sync. Here is the script in the project root.
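As a rough sketch of that idea in Python: the function name, the test_*.py pattern, and the tests folder name are assumptions. It also assumes test file names are unique across the project; two files with the same name would collide.

```python
from pathlib import Path


def sync_test_links(project_root: Path, tests_dir_name: str = "tests") -> int:
    """Recreate symlinks in tests/ that point at every test_*.py in the project."""
    tests_dir = project_root / tests_dir_name
    tests_dir.mkdir(exist_ok=True)
    for stale in tests_dir.iterdir():
        if stale.is_symlink():
            stale.unlink()  # clear the links from the previous run
    count = 0
    # sorted() collects all matches before any new link is created.
    for test_file in sorted(project_root.rglob("test_*.py")):
        if tests_dir in test_file.parents:
            continue  # skip anything that already lives inside tests/
        (tests_dir / test_file.name).symlink_to(test_file.resolve())
        count += 1
    return count
```

In a real project you would probably also want to skip folders such as a virtual environment.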

We can now run all tests in one click in PyCharm.


Goal 4: Issue USB Color Animation Commands

The next goal is to make the LED change simple colors.

Ternary operator: Where is the ternary operator in Python?

There is no ternary operator (? or ?:), so the following causes several syntax errors.

The Python way to write a similar conditional expression is:
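A one-line sketch with a hypothetical helper: the value comes first, then the condition, then the alternative.

```python
def channel_label(value: int) -> str:
    # Java: String label = value > 127 ? "bright" : "dim";
    label = "bright" if value > 127 else "dim"
    return label
```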

Lambdas: Lambdas act like mathematical closures in Python. How extensible are lambdas in Python?

Lambdas in Python can be used to postpone the execution of a method. Consider the following.

Since Python 3.8 (Oct 2019), we can actually instantiate temporary variables with := by enclosing the lambda in a tuple () and making each statement an entry in said tuple. Also, the lambda has to be of the form () -> None, so PyCharm warns about returning a tuple from the lambda. Also, local variables need to be bound – do not try to use a local variable directly in the lambda.

It feels like a hack to use a tuple to execute code in a lambda, but here is the most Java-like lambda I could make to test some colors.
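A sketch of the tuple trick, assuming Python 3.8+ and made-up names (show_color, log): each tuple entry is evaluated in order, := creates a temporary variable, and the color values are passed in as parameters rather than captured from the surrounding scope.

```python
from typing import Callable, List

log: List[str] = []

# Each tuple entry is evaluated left to right, so the tuple acts like a
# block of statements. The walrus operator creates a temporary variable
# that is local to the lambda.
show_color: Callable[[int, int, int], tuple] = lambda red, green, blue: (
    hex_code := f"#{red:02x}{green:02x}{blue:02x}",
    log.append(f"sending {hex_code}"),
    log.append("done"),
)

# Postponed execution: nothing above runs until the lambda is called.
show_color(255, 128, 0)
```

The return value is the whole tuple, which is why an IDE may complain when the lambda is expected to return None.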

Named tuples: How to use named tuples with validation?

Let’s use an RGB class that holds red, green, blue as integers, is immutable, and has in-built validation. We could use a class with an __init__ method and some Final type-hints, but let’s take this opportunity to write an immutable named tuple called RGB.

The above worked great for a while until I wanted to add validation and unit test the above tuple. It then became the following because NamedTuple isn’t actually a class; it’s a function.
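One common way to get an immutable, validated named tuple (a sketch, and not necessarily the version that ended up in the project) is to subclass the result of collections.namedtuple and validate in __new__.

```python
from collections import namedtuple


class RGB(namedtuple("RGB", ["red", "green", "blue"])):
    """Immutable color triple; each channel must be an int in 0..255."""

    __slots__ = ()  # keep instances as lightweight as a plain tuple

    def __new__(cls, red: int, green: int, blue: int) -> "RGB":
        for name, value in (("red", red), ("green", green), ("blue", blue)):
            if not isinstance(value, int) or not 0 <= value <= 255:
                raise ValueError(f"{name} must be an int in 0..255, got {value!r}")
        return super().__new__(cls, red, green, blue)
```

Validation lives in __new__ rather than __init__ because a tuple’s contents are fixed at creation time.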


Implement Original BlinkStick Methods

Instead of copying the original, verbose, Python 2 BlinkStick code, let’s rewrite it more simply with type hinting. For compatibility, let’s make a BlinkStick class that inherits from FadeStick to keep the CPU-sleep animation logic separate from the FadeStick code. Here we go.

The original code has been re-written and condensed into the above lines.

Assert statements: Asserts can be turned off at runtime so they are not reliable in production. Is there a Python way to validate parameters?

Unlike, say, Spring Boot annotations (e.g. @Max(255), @NotEmpty), Python doesn’t have built-in parameter validation annotations or decorators. We could always use interval comparisons, but they would be needed for each parameter in each method. Can we do better?

Until I know better, I rolled my own int class (with unit tests) that allows for some range logic and helpful exceptions. Since everything in Python is an object, class RangeInt(int): is a valid construct, quite unlike Java.

Then we can do something like Java’s val = Objects.requireNonNull(val) in Python.
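A minimal sketch of the idea; the constructor signature here is my own assumption, not the project’s actual RangeInt API. Because int is immutable, the check goes in __new__.

```python
class RangeInt(int):
    """An int that validates its own range on construction (hypothetical API)."""

    def __new__(cls, value: int, low: int, high: int, name: str = "value") -> "RangeInt":
        if not low <= value <= high:
            raise ValueError(f"{name} must be in [{low}, {high}], got {value}")
        return super().__new__(cls, value)


def set_brightness(level: int) -> int:
    # Validate and reassign in one line, similar to Objects.requireNonNull.
    level = RangeInt(level, 0, 255, "level")
    return level
```

The result still behaves like an ordinary int, so it can be used in arithmetic and comparisons without unwrapping.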


Implement New FadeStick Methods

Because we don’t want BlinkStick and FadeStick to share the same methods, it would be nice to implement some abstract methods in a base class I will call FadeStickBase.

Abstract methods: Why are there no abstract or virtual methods in Python?

Right away we see that Python has no abstract keyword, so the following, while incomplete, permits the given class to be instantiated.

I’m told the Pythonic way to achieve abstract methods is the following:

However, coming from Spring Boot and desiring the base class to not be instantiated due to the “abstract” methods, the following does the trick.
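The standard-library tool for this is the abc module, which I am assuming is the technique meant here. A sketch with a hypothetical blink method: a class with an unimplemented @abstractmethod cannot be instantiated.

```python
from abc import ABC, abstractmethod


class FadeStickBase(ABC):
    @abstractmethod
    def blink(self, count: int) -> str:
        """Subclasses must implement this."""


class FadeStick(FadeStickBase):
    def blink(self, count: int) -> str:
        return f"blink x{count}"
```

Calling FadeStickBase() raises a TypeError, while FadeStick() works because it overrides every abstract method.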

We can now go ahead and split BlinkStick and FadeStick into classes inheriting from FadeStickBase with mutually exclusive LED methods.

However, not yet being satisfied, here is the technique I will actually use going forward to get the best of both worlds: an @abstract annotation that raises Pythonic exceptions. As a bonus, the method name that caused the exception is in the message.
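A guess at what such a decorator might look like (the names and the message format are my own, and this simple version does not by itself block instantiation): it replaces the method body with one that raises NotImplementedError and names the offending method.

```python
import functools


def abstract(method):
    """Hypothetical @abstract decorator: calling the method raises an error
    that names the class and the method."""

    @functools.wraps(method)
    def wrapper(self, *args, **kwargs):
        raise NotImplementedError(
            f"{type(self).__name__}.{method.__name__}() is abstract"
        )

    return wrapper


class StickBase:
    @abstract
    def morph(self, color: str) -> str: ...


class Stick(StickBase):
    def morph(self, color: str) -> str:
        return f"morph to {color}"
```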

Let’s implement (read: override) those abstract methods according to the Java counterpart.

Increment operators: Do increment operators like x++ and ++x work as expected?

Given the following sample code, you may be surprised to discover that the ++i doesn’t increment i and the 0th array entry is repeatedly updated.

Why? There is no unary increment operator in Python. In Python, ++i == i the same way +1 == 1, so ++1 == 1. Rats. Here is a workaround using enumerate to also get the ith position.
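A sketch of the trap and the workaround, with made-up function names:

```python
def fill_with_plus_plus(colors):
    slots = [None] * len(colors)
    i = 0
    for color in colors:
        slots[++i] = color  # ++i is +(+i): i stays 0, so slot 0 is overwritten
    return slots


def fill_with_enumerate(colors):
    slots = [None] * len(colors)
    for i, color in enumerate(colors):  # enumerate yields (index, item) pairs
        slots[i] = color
    return slots
```

With ["red", "green", "blue"] as input, the first version leaves only the last color in slot 0, while the second fills every slot in order.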

The new FadeStick asynchronous pattern methods become thusly:


Goal 5: Interrogate the CPU

In Linux, we can query the processing time of the CPU from /proc/stat directly without a third-party module like psutil or a program like htop. One caveat is that /proc/stat returns the processing time presumably since boot, not the near-instantaneous CPU time we need. A wonderful solution on StackOverflow led to the strategy of checking the CPU times (processing and idle) periodically and calculating the CPU use percent during a given slice.
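The sampling strategy can be sketched as follows. The function names are made up, and treating idle plus iowait as “idle” is a simplifying assumption. The first line of /proc/stat holds cumulative jiffies in the order user, nice, system, idle, iowait, irq, softirq, steal, so the load over a time slice is the non-idle share of the difference between two samples.

```python
from typing import Tuple


def parse_cpu_line(line: str) -> Tuple[int, int]:
    """Return (idle, total) jiffies from the aggregate 'cpu' line of /proc/stat."""
    fields = [int(field) for field in line.split()[1:]]
    idle = fields[3] + (fields[4] if len(fields) > 4 else 0)  # idle + iowait
    total = sum(fields[:8])  # guest columns are left out of the total
    return idle, total


def cpu_percent(previous: Tuple[int, int], current: Tuple[int, int]) -> float:
    """CPU use between two samples, as a percentage of elapsed jiffies."""
    idle_delta = current[0] - previous[0]
    total_delta = current[1] - previous[1]
    if total_delta <= 0:
        return 0.0
    return 100.0 * (total_delta - idle_delta) / total_delta


def read_cpu_sample(path: str = "/proc/stat") -> Tuple[int, int]:
    """Read one (idle, total) sample from the live system."""
    with open(path) as stat_file:
        return parse_cpu_line(stat_file.readline())
```

In a monitor loop, we would call read_cpu_sample(), sleep for the slice length, call it again, and feed both samples to cpu_percent().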

Then, by performing the above cat programmatically in a loop, with a bit of math and string formatting we can see something resembling the following:

An infinite loop with a sleep command to drive the LED color is a nice demo. Let’s use stress-ng to simulate a very high load on the CPU and show just the htop bars with htop -u nobody.

Here is the result.


Goal 6: Write a Daemon in Python

Can we write daemons in Python agnostic of Systemd and BusyBox, and without backgrounding a do-wait-while script that may generate CPU soft-lock warnings? It turns out we can. Here is a short proof-of-concept that spawns a singleton daemon, so if this same script is called repeatedly, only one daemon will be spawned.

This is a nice toy example, but if this is going to be robust, let’s use the de facto standard library python-daemon. Why? We need signal handling to communicate with the daemon (e.g. stop, restart), as well as have the daemon log to Syslog. Debugging a daemon is harder than it sounds because it is a detached process and the PyCharm debugger won’t follow it, so we need Syslog.

When os.fork() is executed and the main thread terminates, logging pipes are closed so the daemon is in Plato’s cave. With a bit of fancy coding, we can keep those file descriptors open as follows.

Preserve the Logger in the Daemon

Here is a way that works to preserve the logging to Syslog and Stdout (just the main thread). I post it here only because it took a while of trial and error to engineer, in case it helps anyone.

First, here is the logging setup. The trick which is important later is to attach the Syslog handler to the root logger and then create a named logger daemon. This is due to what variables survive a fork, and the root logger does.

Next, we have to tell the daemon context to keep the Syslog socket open on fork.

Now the daemon can log to /var/log/syslog.

Starting the Daemon

With a daemon context available, we can start the daemon process like so:

The output from starting the daemon then looks like the following.

Console

Syslog

Using tail -F /var/log/syslog:

Stopping the Daemon

How to stop a detached process that is running on its own? Should we ps aux | grep pytho[n], find the zombie PID, and kill -9 ${PID}? We can be more sophisticated and graceful with stopping (and restarting) our daemon: send a signal programmatically.

Then how does the daemon catch the signal? Back to the context method, we can pass in a signal map so a prescribed method in the daemon runs on a given signal.

When the _end() method runs, let’s toggle some flag that ends a while-true loop so the daemon then gracefully terminates.
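Stripped of the daemon context, the flag-toggling idea looks roughly like this. The Worker class is hypothetical; the plain signal.signal calls stand in for the signal map, and the sending side would be something like os.kill(pid, signal.SIGINT).

```python
import signal
import time


class Worker:
    """Hypothetical daemon body whose loop ends when a signal arrives."""

    def __init__(self) -> None:
        self.running = True
        self.ticks = 0

    def _end(self, signum, frame) -> None:
        # Signal handlers receive the signal number and the current stack frame.
        self.running = False

    def install_handlers(self) -> None:
        # Must be called from the main thread of the process.
        signal.signal(signal.SIGINT, self._end)
        signal.signal(signal.SIGTERM, self._end)

    def _run(self, interval: float = 0.01) -> None:
        while self.running:
            self.ticks += 1  # the real daemon would sample the CPU here
            time.sleep(interval)
```

The handler only flips a flag, and the loop notices it on its next pass, so the daemon gets a chance to finish its current iteration and clean up.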

Be careful: If the daemon is already running and we modify the code above and re-run the parent code, the daemon does not have the same methods and code; it has the _run() and _end() code from the first time it was forked.

Goal 7: Package the App into a Standalone Executable

As it stands, we have three modules we imported in requirements.txt. Are these the only external requirements?

Standalone executable: How to package a Python project into a single file the way Java does with JAR files, but without installing Python?

There are some suggestions from the community: PyInstaller and Nuitka.

PyInstaller

Let’s give PyInstaller a shot from the PyCharm terminal.

We can see that a 6.8 MB 64-bit ELF executable was generated. The moment of truth is here. Let’s execute the daemon.

Nothing happens and the folder SHA1 changes between invocations. Let’s figure out why the SHA1 changes with some debug messages.

Ah, venv is using temp folders per execution. Let’s consult Syslog next with tail -f /var/log/syslog:

Darn. We’re having USB problems too. Let’s execute the standalone binary in a system shell outside of PyCharm and check the Syslog.

I’m less than enamored with PyInstaller. Let’s explore Nuitka next.

Nuitka

Docker in Linux integrates very well in JetBrains products, so let’s have some fun making a Nuitka, Cython, and AppImage build pipeline to compile a standalone executable as a new Python+Docker skill.

Gotcha: Nuitka compiles a Python project into an executable and libraries in a folder, then AppImageKit compacts that into a single, portable binary. AppImageKit requires fuse, which is a deal-breaker for Docker projects.

After safely sidestepping the no-fuse-in-Docker issue (without elevating permissions), and reverse-engineering the hidden requirements to make this Dockerized tool run, we’ve made progress. Here are the issues we overcame:

  • Fuse is needed in Docker
  • An icon in XPM format is required
  • The file command is required
  • Run Python and Pip as a non-root user

When built, the main executable in the resulting AppDir folder of libraries runs perfectly, but the AppImage standalone binary doesn’t. Let’s visit syslog.

No idea. We need to timebox this. Let’s import _posixsubprocess explicitly and move on.

This goes on a few more times until these manual imports are added.

Run. Failure. Syslog again.

This is getting silly and points to a systemic problem. Actually, these libraries _posixsubprocess.so, _hashlib.so, etc. are present in the Nuitka build folder. Strange. Here is a Hail Mary pass with --appimage-extract-and-run:

And, it works! This must be either a fuse issue, or some issue with AppImageKit running in a Docker container. Good enough for now so we can move on. Below is a demo of the problem and the workaround.

One small note: With the extracted libraries this small daemon expands from 9MB to 25MB on disk.

Gotcha: Examine the compiled libraries to see what is taking up the most space. For example, the use of a simple SHA1 function triggers the inclusion of a 2.9MB crypto library.

Nuitka, Cython, AppImageKit Dockerfile

Here is the Dockerfile that powered the previous section. Bind your work folder to /workdir and change your uid:gid if they are not both 1000.


Results

With the background of a Java developer learning Python for the first time and an ambitious project in mind, in a few days, we learned how to set up PyCharm, type-hint, write unit tests to ensure correctness throughout, interact with the USB bus, write a system daemon, make a build pipeline in Docker directly in PyCharm, and deliver a portable binary to display a morphing color corresponding to the load of all the CPUs. Here is the source code.

Success: We’ve used a motivating project to learn Python 3.8+ from scratch and overcome many of the language’s gotchas when coming from Java.

Bonus: Automatic Code Formatting on File Save

Want to collaborate well with others? Want to follow 2001’s PEP 8 coding style for Python and make (keep) programming friends? Pressing ctrl+a (select all), ctrl+alt+l (reformat), and then ctrl+alt+o (optimize imports) works. However, with PyCharm and either the BlackConnect (plus blackd) or the Save Actions plugin, you can do just that automatically. Here are the steps:


Bonus: Inter-Process Communication (IPC)

Let’s see what our daemon is up to. How is the CPU load now? Is it even running? Is the FadeStick plugged in? Let’s ask it.

Inter-process communication (IPC) is a simple enough concept, but let’s see if Python makes it easy to implement: we just need a named pipe and a lot of error handling.

This code sends a SIGUSR1 signal to the daemon similar to how we sent a SIGINT to ask it to terminate. It then polls a named pipe (which is just an ordinary file on disk with special attributes), and times out after five seconds if no reply is received (crashed daemon?).

The daemon, registered for SIGUSR1, writes its status (just a string of text, but we could serialize (pickle) anything) to the named pipe and does not block.

There are some details omitted, like how a string is converted to ASCII bytes, and I added a lock because I would synchronize in Java, but essentially we read and write to a shared named pipe for IPC in Python too. Neat.
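The pipe half of that exchange can be sketched as follows. The function names are made up, the signal step and the lock are left out, and this assumes a Unix-like system where os.mkfifo is available.

```python
import os
import time
from typing import Optional


def ensure_pipe(pipe_path: str) -> None:
    """Create the named pipe if it does not exist yet."""
    if not os.path.exists(pipe_path):
        os.mkfifo(pipe_path)


def write_status(pipe_path: str, status: str) -> bool:
    """Daemon side: write without blocking; False if the pipe cannot be opened."""
    try:
        fd = os.open(pipe_path, os.O_WRONLY | os.O_NONBLOCK)
    except OSError:
        return False  # typically: nobody has the pipe open for reading
    with os.fdopen(fd, "wb") as pipe:
        pipe.write(status.encode("ascii"))
    return True


def read_status(pipe_path: str, timeout: float = 5.0) -> Optional[str]:
    """Client side: poll the pipe until a reply arrives or the timeout passes."""
    fd = os.open(pipe_path, os.O_RDONLY | os.O_NONBLOCK)
    deadline = time.monotonic() + timeout
    try:
        while time.monotonic() < deadline:
            try:
                data = os.read(fd, 4096)
            except BlockingIOError:
                data = b""  # a writer is connected but has not written yet
            if data:
                return data.decode("ascii")
            time.sleep(0.05)
        return None  # no reply in time (crashed daemon?)
    finally:
        os.close(fd)
```

Opening the read end with O_NONBLOCK returns immediately even when no writer is connected, which is what makes the timeout possible.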


Bonus: Reduce Standalone Binary Size

Remember this?

Let’s see if we can get away without the crypto library, Unicode, and CJK codecs to shave off several megabytes from the standalone binary.

Notes:

  1. I cannot say pickling and unpickling with a straight face.