<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Engineering on Aaron Ang</title><link>https://aaron-ang.github.io/blog/engineering/</link><description>Recent content in Engineering on Aaron Ang</description><generator>Hugo</generator><language>en-us</language><lastBuildDate>Fri, 05 Sep 2025 00:00:00 +0000</lastBuildDate><atom:link href="https://aaron-ang.github.io/blog/engineering/index.xml" rel="self" type="application/rss+xml"/><item><title>Building an On-Device Automobile Assistant</title><link>https://aaron-ang.github.io/building-an-on-device-automobile-assistant/</link><pubDate>Fri, 05 Sep 2025 00:00:00 +0000</pubDate><guid>https://aaron-ang.github.io/building-an-on-device-automobile-assistant/</guid><description>&lt;p&gt;&lt;strong&gt;Disclaimer:&lt;/strong&gt; I wrote my first blog post two years ago, proud to have done it without any help from LLMs. This time, I have fully embraced them to speed up and sharpen my writing. That said, most LLM-generated text tends to share the same stylistic tone, so copy-pasting isn’t ideal. For technical posts like this, the challenge is balancing clarity with brevity. The core ideas still have to come from the writer, but LLMs provide the vocabulary and structure that make those ideas easier to read. I don’t expect that dynamic to change. Creative work, whether writing or coding, remains an iterative process, where each step refines the alignment between prose and concept.&lt;/p&gt;</description></item><item><title>Building an End-to-End Contract Discovery System</title><link>https://aaron-ang.github.io/building-an-end-to-end-contract-discovery-system/</link><pubDate>Fri, 02 Aug 2024 00:00:00 +0000</pubDate><guid>https://aaron-ang.github.io/building-an-end-to-end-contract-discovery-system/</guid><description>&lt;h2 id="background"&gt;Background&lt;/h2&gt;
&lt;p&gt;Red Hat&amp;rsquo;s Sales and Deal Management team recently faced a significant challenge: migrating &lt;strong&gt;hundreds of thousands&lt;/strong&gt; of documents from our legacy &lt;a href="https://www.salesforce.com/crm/"&gt;Salesforce CRM&lt;/a&gt; to the new &lt;a href="https://www.salesforce.com/sales/"&gt;Sales Cloud&lt;/a&gt; system. This migration was necessary to maintain efficient access to historical contract data, which is crucial for generating new contracts with existing clients and partners. The sheer volume of documents made manual migration impractical, putting millions in potential revenue at risk and driving up operational costs through inefficient contract analysis.&lt;/p&gt;</description></item><item><title>Tinkering with Spark</title><link>https://aaron-ang.github.io/tinkering-with-spark/</link><pubDate>Wed, 19 Jul 2023 00:00:00 +0000</pubDate><guid>https://aaron-ang.github.io/tinkering-with-spark/</guid><description>&lt;h2 id="background"&gt;Background&lt;/h2&gt;
&lt;p&gt;In the summer of 2022, I interned at &lt;a href="https://shopee.com/"&gt;Shopee&lt;/a&gt; as a product analyst on the Search and Recommendation (SnR) data team. My primary responsibility was to deliver reliable and actionable analytics for product managers. Because we frequently ran large-scale queries throughout the day, any job delays or failures directly impacted reporting timelines and slowed progress toward feature improvements or releases.&lt;/p&gt;
&lt;p&gt;Our main tools for data processing and querying were &lt;a href="https://prestodb.io/"&gt;Presto&lt;/a&gt; and &lt;a href="https://spark.apache.org/"&gt;Apache Spark&lt;/a&gt;, supplemented by internal tools that abstracted away much of the underlying engineering complexity.&lt;/p&gt;</description></item></channel></rss>