Why use AWS Glacier for big data backup? It’s exceedingly inexpensive to archive data for disaster recovery on Glacier. AWS Glacier is only US$0.004 per GB/mo, and their SDK is beautiful. Here I outline a pricing matrix for cloud storage providers, and I take a look at the Java SDK for working with AWS Glacier to effectively archive 200GB a week.
This is a problem story about how I preferred Java to other languages to communicate with a troublesome financial REST API endpoint because Java is a strongly-typed and verbose language where it is easy to write unit tests and build up solid modules to make a complete, resilient project.
Normally data packets come and go on the same interface, but VPN routing causes response packets to return through the tunnel, where they are dropped as unsolicited traffic – the connection hangs until a timeout. This makes it difficult to SSH into a server with an active VPN connection, but I explain a way to do just that.
Sometimes remote Java apps leak memory or are killed by the OS. Let’s connect through an SSH tunnel to a remote JVM running JDK 11 on an embedded Ubuntu system and profile memory and CPU usage with free tools VisualVM and JStatD. No firewall adjustments are needed. We’ll also set up JMX connections to allow remote heap dumps and garbage collection. Finally, I’ll explore the features of VisualVM.
Let’s set up a remote Docker daemon on AWS and connect to it securely over HTTPS with PhpStorm. This will allow us to develop and administer Docker containers remotely with the PhpStorm IDE. Here we’ll create TLS certificates, configure the Docker daemon, verify the setup, and configure PhpStorm to even use Docker Compose remotely.
Sometimes you need to be alerted when a website string is present or absent. Here I outline a quick and easy method to alert you to such changes specifically in JSON data using just a free Chrome browser extension. I then showcase two case studies of how this has helped me with retail shopping and domain name purchasing.
Breadboard power supplies cost less than a dollar on AliExpress. They are quite convenient for quickly powering and prototyping microprocessor circuits, Arduino projects with sketches, USB-powered prototypes, and so on. The imagination is the limit. I spent the morning trying to figure out why my MB102 breadboard power supply was outputting only 3.5V, not the expected 5.0V.
Given a cluster computing rig of twenty-eight processors, each can have either USB 2.0 or microSD local flash storage. Which type of flash and maker is the fastest? Make the wrong choice and the cluster is painfully slow. Not all microSD cards or USB drives are made the same, and interestingly, random read and write speeds vary wildly. Here I test several storage configurations with striking benchmark results.
Markdown is used all the time, for example in GitHub readme.md files, in Slack messages, and in WordPress themes. The HTML produced by rendering Markdown has no class or id attributes, and cannot be nested in HTML tags. How can we style an individual Markdown element? Let’s use a neat CSS trick to easily do just that.
My newer-model Panasonic microwave oven stopped working. To get it working I needed to get past anti-tamper screws and “special” fuses. I suspect Panasonic wants us to buy another microwave instead. Not this time!
For the cluster computing project I’m working on, I need 28 microSD cards. There was an AliExpress sale with good reviews, so I ordered a batch of 30 microSD cards, and at a great price point at the time. As long as the cards are Class 10 and work then we should be good, right? Results: Half are fake or defective. The rest are painfully slow. No refunds.
Let’s build a 112-core 1.2GHz A53 cluster with 56GB of DDR3 RAM and 584GiB of high-availability distributed file storage, running at most 200W. The goal is to use cluster computing to perform fast Apache Spark operations on Big Data, and all on-prem for a fraction of what cloud computing costs.
Problem: How to clean the raw OHLCV candle data from the broker for time series analysis? Suppose we have an autonomous program that prioritizes and continually downloads the latest minute and day candles, as well as periodically gets new symbols from the broker. The problem is that the candles are not guaranteed to be full-period […]
Before I acquire time-series stock candles, I need to know the storage footprint, database architecture, and how to deal with multi-threaded concurrency of the databases. How large could a single database grow? Each candle has OHLCV data plus a unix timestamp in seconds (Note: I chose this timestamp structure to aid in cleaning the candles […]
This would make a good interview question: There are about 120,000 public North American securities, bonds, rights, and index symbols. You have a paid API that can access all of them in OHLCV format if they are quotable. There are two critical API constraints: 15,000 calls per hour and 30 calls per second. Napkin math: Minute […]
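As a rough sketch of where that napkin math starts, the snippet below works out which of the two limits actually binds and how long a single pass over every symbol would take. The symbol count and both rate limits come from the excerpt; the one-call-per-symbol figure and the function names are assumptions made here for illustration, not the post's actual numbers or code.

```python
# Figures quoted in the excerpt.
SYMBOLS = 120_000
CALLS_PER_HOUR = 15_000
CALLS_PER_SECOND = 30

# Assumption for illustration only: one API call covers one symbol.
CALLS_PER_SYMBOL = 1


def effective_calls_per_hour(per_hour: int, per_second: int) -> int:
    """Return the tighter of the two limits, expressed per hour."""
    return min(per_hour, per_second * 3600)


def hours_for_full_pass(symbols: int, calls_per_symbol: int,
                        per_hour: int, per_second: int) -> float:
    """Hours needed to make every call once at the maximum sustained rate."""
    total_calls = symbols * calls_per_symbol
    return total_calls / effective_calls_per_hour(per_hour, per_second)


if __name__ == "__main__":
    budget = effective_calls_per_hour(CALLS_PER_HOUR, CALLS_PER_SECOND)
    hours = hours_for_full_pass(SYMBOLS, CALLS_PER_SYMBOL,
                                CALLS_PER_HOUR, CALLS_PER_SECOND)
    print(f"Binding limit: {budget} calls/hour "
          f"(about {budget / 3600:.2f} calls/second sustained)")
    print(f"One pass over {SYMBOLS} symbols: {hours:.1f} hours")
```

Under these assumptions the hourly cap is the binding constraint, since 30 calls per second would otherwise allow 108,000 calls per hour. The per-second limit only matters for short bursts.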
Things break. Just the other day through a series of seemingly unrelated events, a new Microsoft x509 certificate made its way into a security handshake process which went unnoticed until current single sign-on sessions began to expire. Had we also had automated security testing, we would have caught this one-off. I’ll explain how I set […]
With PHPUnit I need to mock a class that is declared deep, deep in an arrangement of other classes. I’ll show you how easy it is with class_alias() and the right PHPUnit annotations. Background A given class spins up a headless Chrome instance. I don’t want to fire up and close a real Chrome instance […]
Here are a few PHP web shell scripts I found on a production server in late 2016. I’ll show some of them, sneaky as they be, and then my efforts to secure a production server.
I’d like to share my efforts to prevent page breaks in the middle of paragraphs and maximize the use of page space when printing web pages to PDF. I’ll outline how this PHP+NodeJS+Chrome tool and algorithm accomplish this. The motivation is to prevent pictures from being cut off, cut halfway through, or from being pushed […]
GoAccess web log analyzer is a beautiful tool to show real-time traffic and stats – including GeoIP information, bandwidth usage, and visitor time distributions – of my web projects and apps over and above Google’s Webmaster Tools and UA reporting. At a glance I can see current traffic and historic traffic just by adding a […]
Here’s how to go about debugging, stepping through, and profiling remote code like a breeze. These are the steps I took to install/enable Xdebug on a remote LAMP stack and debug/profile hosted code using PhpStorm and a Chrome extension. As a bonus I’ll share how I debug cURL requests with Xdebug too. 1. Setup remote […]
Among friends let’s agree we’ll be privately caching videos and not permanently saving them, or we’ll be using them for Fair Use, and we’ll certainly not upload nor share these videos outside of the originating platform (e.g. YouTube.com). Existing YouTube downloader scripts: YouTube-Downloader (does not work with videos using a cipher signature) YouTube video downloader […]
These are the steps I took to compile Firefox so it can run on a RHEL shared hosting server which doesn’t have D-Bus installed and only has GLibc 2.12. Situation You want to run headless Firefox on a shared host running RHEL You don’t have privileged access (e.g. no root) Your shared host only has […]
This is how I compiled the Xorg Server for RHEL on a CentOS machine with modifications to create a portable Xvfb binary. Xvfb (X virtual framebuffer) is an in-memory display server for Linux and Unix-like OSes. It enables running graphical applications without a display such as running a headless browser (e.g. a full-blown Firefox instance […]
The goal is to run Linux binaries on a shared host where we do not have root access, and there are no package managers installed (because it’s shared hosting). The shared libraries need to be copied over manually. Here is how we can do it. If you’re like me, you might have a (few) shared […]
When getting started with LINE API messaging, you need to know the mid of a message recipient. It’s not his/her username. It’s a string that looks like ub8dbd4a12c322f6c0118883d839c55a4. LINE utilizes a callback URL that you can set for your trial LINE bot. At this endpoint you can place a script, shown below, which will report […]
Sometimes I want to share a large file on my site without tying up bandwidth. If I don’t intend for the file to be downloaded often, I can offload the work to Dropbox and use PHP or htaccess to share a convenient URL. https://yoursite.com/project/psdfile/index.php https://yoursite.com/project/psdfile.psd You can get the direct download link for a Dropbox […]
Every now and then there is an hours-long campaign of fraudulent AdWords-clicking from countries all over the world, ranging from Iran to Singapore, dedicated to clicking my cost-per-click Google ads in a vain attempt to exhaust a given daily budget early. My hat goes off to the chap for organizing the attack, or at least […]
The inspiration to make my own Pokémon Go scanner came from this great site FastPokeMap.se (and Twitter feed). Try this site first before venturing out to make your own scanner. It’s a neat site, but unfortunately each scan is slow, taking upwards of 20 seconds, and the failure rate is high. Its strength comes from […]
Using a fabulous WordPress plugin called External Links to put little arrow icons beside external links was the task this morning. Installation and setup took less than 60 seconds and all worked well. But wait, there are arrow icons in my header navigation links. Surely as per the documentation I can add the noicon class […]
Sometimes one of my sites is under attack from a click-fraud campaign. I needed to devise a way to detect such an attack and instantly and automatically change my Cloudflare security level from ‘medium’ to ‘under attack’. When in under-attack mode, Cloudflare performs additional browser checks to filter out robots. It doesn’t stop all the […]
That was a really high bridge – 70m – and the river was shallow beneath us. There was no one else there. The team of jump masters was ready for just us two. It was amazing, like we had the whole valley to ourselves, Alex and I. We wanted an adventure, something we’d never […]
I use PhpStorm daily and it’s nice to remember the shortcuts. Sitepoint has a great rundown of productivity shortcuts. I’m reproducing my favorite ones here. This comes straight from Sitepoint. PhpStorm remembers multiple clipboard contents – you can press CTRL+SHIFT+V to summon a popup which lets you paste clipboard content that’s less recent than […]