Big Data Backup to S3 Glacier via Java SDK

Why use AWS Glacier for big data backup? It’s exceedingly inexpensive to archive data for disaster recovery on Glacier. AWS Glacier is only US$0.004 per GB/mo, and their SDK is beautiful. Here I outline a pricing matrix for cloud storage providers, and I take a look at the Java SDK for working with AWS Glacier to effectively archive 200GB a week.

DNS Block Malicious Ads and Scripts: Pi-Hole

Ads are getting more and more aggressive. Some ads are even malicious. Some sites even load crypto-currency mining scripts in the background in JavaScript. Users have discovered that a lot of traffic is advertisement or tracking scripts, and putting a damaging strain on our mobile device data plans and batteries. Here I explain how to safely setup Pi-hole – a network-wide ad-traffic blocker – in a Docker container on an external Linux device as a hardware DNS server to block ads.

Profile Remote Java Apps with VisualVM or JMC

Sometimes remote Java apps leak memory or are killed by the OS. Let’s connect through an SSH tunnel to a remote JVM running on an embedded Ubuntu system and profile memory and CPU usage with free tools VisualVM and JStatD, or Java Mission Control. No firewall adjustments are needed. We’ll also set up JMX connections to allow remote heap dumps and garbage collection. Finally, I’ll explore the features of VisualVM.

Troubleshooting Electronics: Voltage Discrepancy

Breadboard power supplies cost less than a dollar on AliExpress. They are quite convenient for quickly powering and prototyping microprocessor circuits, Arduino projects with sketches, USB-powered prototypes, and on. The imagination is the limit. I spent the morning trying to figure out why my MB102 breadboard power supply was outputting only 3.5V, not the expected 5.0V.

Experience with Inexpensive MicroSD Cards

For the cluster computing project I’m working on, I need 28 microSD cards. There was an AliExpress sale with good reviews, so I ordered a batch of 30 microSD cards, and at a great price point at the time. As long as the cards are Class 10 and work then we should be good, right? Results: Half are fake or defective. The rest are painfully slow. No refunds.

Cleaning Raw Candle Data for Time Series Analysis

Problem: How to clean the raw OHLCV candle data from the broker for time series analysis? Suppose we have an autonomous program that prioritizes and continually downloads the latest minute and day candles, as well as periodically gets new symbols from the broker. The problem is that the candles are not guaranteed to be full-period […]

Automated Web Testing with PHP and Selenium

Things break. Just the other day through a series of seemingly unrelated events, a new Microsoft x509 certificate made its way into a security handshake process which went unnoticed until current single sign-on sessions began to expire. Had we also had automated security testing, we would have caught this one-off. I’ll explain how I set […]

Algorithm: Optimized PDF Web Page Print Layout

I’d like to share my efforts to prevent page breaks in the middle of paragraphs and maximize the use of page space when printing web pages to PDF. I’ll outline how this PHP+NodeJS+Chrome tool and algorithm accomplish this. The motivation is to prevent pictures from being cut off, cut halfway through, or from being pushed […]

Running Xvfb on a RHEL Shared Host (without X)

This is how I compiled the Xorg Server for RHEL on a CentOS machine with modifications to create a portable Xvfb binary. Xvfb (X virtual framebuffer) is an in-memory display server for Linux and Unix-like OSes. It enables running graphical applications without a display such as running a headless browser (e.g. A full-blown Firefox instance […]

Mitigating AdWords Click Fraud

Every now and then there is an hours-long campaign of fraudulent AdWords-clicking from countries all over the world, ranging from Iran to Singapore, dedicated to clicking my cost-per-click Google ads in a vain attempt to exhaust a given daily budget early. My hat goes off to the chap for organizing the attack, or at least […]

Posted in: Problem StoriesTags: AdWords, Cloudflare, Google, GTM, Htaccess

Pokémon Go Scanner

The inspiration to make my own Pokémon Go scanner came from this great site FastPokeMap.se (and Twitter feed). Try this site first before venturing out to make your own scanner. It’s a neat site, but unfortunately each scan is slow takes upwards of 20 seconds, and the failure rate is high. It’s strength comes from […]