This is a portfolio of some of the professional and personal projects that I have proudly brought to life. I love making web properties, but my passion is Java software engineering.
Real-Time Multi-Threaded Financial Exchange Data Collector
Dec 2017 – Present
A massive project, with over 32,000 LOC¹ and over 400 JUnit 5 tests (many of them parameterized), designed to consume real-time financial data as fast as possible from an unforgiving API, store it efficiently, perform ETL in-place, and continuously back up the data off-site. The result is a battle-tested, lean, multi-threaded Java application designed to use 8 cores efficiently, store data on RAID-1 HDDs, transform data for downstream use by Apache Spark, and provide real-time monitoring via Slack updates.
Motivation: Buying even a subset of the above data by subscription costs thousands of dollars a month, and the purchased data is incomplete, lags, and suffers from survivorship bias. Minute-resolution data is even more expensive. Collecting the raw financial data directly from the exchange provides us with complete and actionable data for ML modeling.
- Core Java 9 with Maven
- REST API with OAuth
- AWS S3 Java SDK
- Spring Framework
- Logback Framework
- JUnit5, Mockito, Hamcrest
- Websockets (streaming data)
- JMX status monitoring
- Slack status reporting
- MySQL and SQLite (over 200GB)
- Embedded hardware and custom Linux
- RAID-1, continuous backups to S3
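As a rough illustration of the multi-threaded consumption described above, here is a minimal sketch of a bounded-queue hand-off between one producer and a fixed pool of workers. This is not the project's actual code: the class name TickPipelineSketch, the queue capacity, and the string messages are all hypothetical, and the "store" step is reduced to a counter.

```java
import java.util.List;
import java.util.Optional;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch: a producer feeds raw messages to a fixed pool of workers.
final class TickPipelineSketch {

    /** Pushes every message through a bounded queue and returns how many were handled. */
    static int process(List<String> messages, int workerCount) {
        // Bounded queue: the producer blocks instead of exhausting memory when workers fall behind.
        BlockingQueue<Optional<String>> queue = new ArrayBlockingQueue<>(1024);
        AtomicInteger stored = new AtomicInteger();
        ExecutorService workers = Executors.newFixedThreadPool(workerCount);

        for (int i = 0; i < workerCount; i++) {
            workers.execute(() -> {
                try {
                    while (true) {
                        Optional<String> message = queue.take();
                        if (!message.isPresent()) {
                            return; // an empty Optional is the stop signal
                        }
                        stored.incrementAndGet(); // stand-in for parsing and storing the message
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
        }

        try {
            for (String message : messages) {
                queue.put(Optional.of(message));
            }
            for (int i = 0; i < workerCount; i++) {
                queue.put(Optional.empty()); // one stop signal per worker
            }
            workers.shutdown();
            if (!workers.awaitTermination(10, TimeUnit.SECONDS)) {
                workers.shutdownNow();
            }
        } catch (InterruptedException e) {
            workers.shutdownNow();
            Thread.currentThread().interrupt();
        }
        return stored.get();
    }

    public static void main(String[] args) {
        int handled = process(java.util.Arrays.asList("tick-1", "tick-2", "tick-3"), 8);
        System.out.println("handled " + handled + " messages");
    }
}
```

The bounded queue is the design choice worth noting: it applies back-pressure to the producer rather than letting a slow disk or database grow an unbounded backlog in memory.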
Competitive Intelligence Sites Monitor
Dec 2016 – Dec 2018
A fun project designed to monitor competitors for promotions, sales, and new products. Using Spring, this piece of software regularly scrapes other websites, RSS feeds, sitemaps, and various other web touch points looking for changes. Combined with automatic git diff and headless-Chrome-based screenshots, significant changes are recorded and sent via Slack to the channels set up for each competitor.
- Core Java 8 with Maven
- Slack API SDK
- Spring Framework
- Logback Framework
- JUnit, Mockito
- Hibernate ORM
- MySQL and JDBC
- Headless Chrome
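The monitor described above relied on git diff to record changes. As a simpler stand-in for the "has anything changed?" check that could come before a diff, the sketch below compares SHA-256 digests of successive snapshots. The class name ChangeDetectorSketch and its methods are hypothetical and are not taken from the project.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: remember one digest per URL and report when it differs.
final class ChangeDetectorSketch {
    private final Map<String, String> lastDigestByUrl = new HashMap<>();

    /**
     * Records the snapshot and returns true only when an earlier snapshot
     * exists for this URL and its digest differs. The first sighting of a
     * URL is treated as a baseline, not as a change.
     */
    boolean hasChanged(String url, String content) {
        String digest = sha256Hex(content);
        String previous = lastDigestByUrl.put(url, digest);
        return previous != null && !previous.equals(digest);
    }

    /** Hex-encoded SHA-256 of the UTF-8 bytes of the text. */
    static String sha256Hex(String text) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-256");
            byte[] hash = md.digest(text.getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder(hash.length * 2);
            for (byte b : hash) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 support is mandatory for Java platform implementations.
            throw new IllegalStateException("SHA-256 unavailable", e);
        }
    }

    public static void main(String[] args) {
        ChangeDetectorSketch detector = new ChangeDetectorSketch();
        System.out.println(detector.hasChanged("https://example.com/sale", "10% off"));
        System.out.println(detector.hasChanged("https://example.com/sale", "25% off"));
    }
}
```

A digest comparison only says that something changed; a diff or a screenshot, as in the original project, is still needed to show what changed.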
Mailing List x Phone Directory Cross-Reference Cleaner
Oct 2008 – Dec 2010
A money-saving Java application that takes a list of up to 10,000 names and mailing addresses and cross-references them with public Canadian phone directories (Canada 411) and other public listings. The result is a list of addressees that are scored on how probable it is that they still live at the given address. Uses multi-threading, proxy servers, bandwidth throttling, and heuristics and pattern matching.
- Core Java 6
- SOCKS 5 proxies
- Hibernate ORM
- Log4J Framework
- MySQL and JDBC
- Java Excel API
Aikido Hombu Timetable App
May 2013 – Jun 2016
A labour of love, the Hombu App, as it is affectionately known, uses heuristic² schedule parsing to retrieve aikido schedule data from the world headquarters of Aikikai aikido in Tokyo (where I also practiced aikido for 5 years).
This app was desperately needed because the official web site at the time was still a 90s-era site that required up to 4 clicks to reach the schedule for the day, and it wasn’t mobile friendly. Viewing the schedule required POST form submissions, so bookmarking was impossible. On top of that, the schedule would change at the drop of a hat, so we were often surprised by different teachers.
This app solved several problems by using push notifications and a pleasing chime to alert users to schedule changes, which were frequent. With pictures of the teachers added, scanning the schedule at a glance is quick and convenient, unlike the aforementioned 4-click method. Additionally, the back-end remembers teacher changes, so one can see what changes have been made far into the future as well: up to 60 days, whereas the official schedule is limited to only 14 days.
The main language of users is Japanese, but many visitors speak English. Although I was told it was impossible to change the app language on-the-fly, I figured out how to do just that: seamlessly change the app language without leaving the app. As time went on, I added video support with animated video thumbnails and video pop-outs so that, again, one doesn’t have to leave the app. Videos can also be saved from YouTube for offline viewing in the app. This technique has worked flawlessly for over three years despite numerous YouTube API changes.
- Objective-C (iOS app)
- Core Java 7, PHP (servers)
- MySQL DB backend
- Static JSON generation
- Automatic app updater tool
- Weather API integration
- YouTube downloader and inline player
- Database failover to an alternate server
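The footnote on heuristic parsing lists date strings such as 2016/5/08, 16-01-5, and 2016年５月１２. One way such inputs might be handled is sketched below; it is an illustration only and not the app's actual parser. It assumes year-month-day order and that two-digit years belong to the 2000s, and the class name LenientDateParserSketch is hypothetical.

```java
import java.text.Normalizer;
import java.time.DateTimeException;
import java.time.LocalDate;
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch: pull year, month, and day out of loosely formatted dates.
final class LenientDateParserSketch {
    // Three digit groups separated by any non-digit characters (/, -, 年, 月, ...).
    private static final Pattern YEAR_MONTH_DAY =
            Pattern.compile("(\\d{2,4})\\D+(\\d{1,2})\\D+(\\d{1,2})");

    static Optional<LocalDate> parse(String raw) {
        // NFKC normalization folds full-width digits such as ５ into ASCII 5.
        String normalized = Normalizer.normalize(raw, Normalizer.Form.NFKC);
        Matcher m = YEAR_MONTH_DAY.matcher(normalized);
        if (!m.find()) {
            return Optional.empty();
        }
        int year = Integer.parseInt(m.group(1));
        if (year < 100) {
            year += 2000; // assumption: two-digit years mean 20xx
        }
        try {
            return Optional.of(LocalDate.of(
                    year, Integer.parseInt(m.group(2)), Integer.parseInt(m.group(3))));
        } catch (DateTimeException e) {
            return Optional.empty(); // e.g. month 13 or day 40
        }
    }

    public static void main(String[] args) {
        for (String sample : new String[] {"2016/5/08", "16-01-5", "2016年５月１２"}) {
            System.out.println(sample + " -> " + parse(sample));
        }
    }
}
```

Returning an empty Optional for unparseable input leaves the caller free to flag the entry for manual review rather than guessing.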
Personal Shell Script
Aug 2016 – Present
If I’m on my iPad or somewhere else when I need shell-level access to my server files, I’ve got that covered. Here’s a good story: I reverse-engineered a very naughty backdoor shell script³ I chanced upon on Stack Overflow. This thing is so dangerous that my virus scanners go berserk when the source code is pasted into a text editor! Additionally, network OWASP scanners just drop the connection when this script is accessed remotely (and thankfully so). It was quite an adventure just to explore it. In a sandbox, I de-weaponized it and removed the malicious bits (proxy, password brute-forcing, passwd scanners, etc.), and cleaned up and refactored much of it, so now it even looks pretty as source code. It is now doing something useful for humanity.
Even with all my efforts, OWASP scanners still flagged my initial shell script. That’s because of the base64-encoded payload that is rightfully suspicious. How did I get around eventually eval’ing a massive block of base64 code? That’s a trade secret. I don’t want this method to get out because there are practically zero ways to scan for this and it isn’t suspicious at all.
suup) from anywhere over HTTPS with no stored passwords.
I’ve made a lot of websites using different technologies, from WordPress themes to static sites to complete AJAX-driven sites. My favorite designs load assets in parallel, consist mainly of pre-rendered static HTML files with AJAX filling in the dynamic bits, lazy-load images with my custom size-aware scripts, and generally strive to look slick. Here are a few examples.
A massive undertaking involving months of project planning, user journey plotting, data architecting, data warehousing, gigabytes of digital asset management, real-time API communication, and analytics to create a completely new and fast web site to disrupt the luxury train travel space in Canada. I was involved in each stage of planning and designed and built 100% of the backend CMS.
Some of the key challenges I overcame were:
- Managing 17,000+ photos on AWS
- Making WordPress faster with memcache and Redis
- Using in-memory SQLite for fast site searches
- Rapid development using Docker
- Synchronizing the production and development databases
- Extensive customization with ACF⁴
- Reliable communication with the product and pricing API (which I also built)
Worldwide Brands Canada
This site makes use of 100% static HTML with lazy-loaded image assets, minified CSS and JS, and parallel loading for faster asset downloads. This site is responsive and mobile friendly.
Here are some exciting performance insights into the above site courtesy of tools.pingdom.com. This site is hosted on a $6/mo shared host with basic HDDs, but the performance comes from parallel loading and from using Cloudflare’s caching like a CDN. Google Analytics is the bottleneck.
The site below is in Japanese and makes use of minified CSS and JS, parallel loading and rich media like a responsive HD YouTube video embedded in the frame of a modern MacBook with an animated preview GIF. This is done so the YouTube clip is only loaded when the play button is pressed, yet the animation draws the visitor in to press the play button. In addition, MP3/OGG audio clips are HTML5-embedded into a listening quiz with a final score and advice presented to the quiz-taker. The main CTA (call-to-action) buttons stand out in a pleasing way.
Here are even better performance insights into the above site, again courtesy of tools.pingdom.com. This site is hosted on a cheaper $4/mo shared host with basic spinning HDDs, but the performance again comes from parallel loading and Cloudflare’s caching. Google Analytics is again the bottleneck.
Masakokoro Aikido Dojo
This was one of my first designs made nearly 7 years ago! It’s not updated much by the owner, but it’s still in existence and the graphic elements have held up over time.
Aikikai Aikido Hombu Timetable
This was a site made before the iOS Hombu App was released. It utilizes the same API that the iOS app uses and uses AJAX and jQuery to populate the rows. The rows are static HTML fetched from cache and inserted into the DOM by jQuery. It automatically refreshes every 60 seconds and is in both English and Japanese. It’s simple, but it illustrates the effectiveness of making an API that can be used across the web, iOS, and Android apps.
There are more web sites I have created, like the one for a now-retired Canadian Member of Parliament, and the one for a medium-sized English school in Calgary that went bankrupt two years after I made their beautiful site. There are other sites I’ve contributed to with programming or data-scraping, like a coupon site, an affiliate marketing site, a reverse phone number search site, and so on.
1. 32 KLOC as counted by find . -name '*.java' | xargs wc -l
2. Heuristics are needed because the different people who oversee the official schedule enter dates in different, and sometimes wildly different, formats (e.g. 2016/5/08, 16-01-5, 2016年５月１２).
3. Not this exact script, but a script similar to it. The original post has been deleted.
4. Advanced Custom Fields