
Processing receipts might sound like a small operational detail. But when you handle them at scale – across different companies, formats and languages – it quickly becomes a significant question of business efficiency.
That was the situation our client, Taksikuutio, found themselves in. Their core business is offering financial management for taxi companies and public service operators in Finland, which means their system needs to automatically read and interpret taxi receipts coming in from various sources.
Both customers and taxi drivers need to apply for expense reimbursements in different scenarios, and they usually do so by sending a photo of the receipt. Needless to say, the photos are often all over the place, with faint text or unexpected layouts.
At first, this seemed like a problem we could solve with traditional parsing. But anyone who has worked with receipts knows how quickly that approach can become cumbersome.
The classic way to automate text extraction is by using regular expressions – sets of rules that look for patterns in text. These work great when the format is predictable. But with dozens of receipt templates and layouts, each new format means new exceptions to handle. Maintaining the system becomes an unsustainable game of whack-a-mole.
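To make the whack-a-mole problem concrete, here is a minimal Python sketch. It is an illustration only, not code from the project: the pattern, the function name and both sample receipts are made up for this example.

```python
import re
from typing import Optional

# Hypothetical rule written for one known layout: "TOTAL 23.50 EUR".
TOTAL_PATTERN = re.compile(r"TOTAL\s+(\d+\.\d{2})\s*EUR")


def extract_total(text: str) -> Optional[float]:
    """Return the total if the text matches the one layout the rule knows."""
    match = TOTAL_PATTERN.search(text)
    return float(match.group(1)) if match else None


# Made-up sample receipts with the same amount in two different layouts.
receipt_a = "Taxi Oy\nTOTAL 23.50 EUR"      # the layout the rule was written for
receipt_b = "Taksi Ab\nYhteensä: 23,50 €"   # Finnish label, decimal comma, euro sign

print(extract_total(receipt_a))  # 23.5
print(extract_total(receipt_b))  # None
```

The second receipt carries exactly the same information, but the rule misses it. Supporting it means another pattern, and so does every layout after that.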
Knowing this, Taksikuutio needed something that could gracefully handle the high number of variations.
Instead of adding more rules, we decided to teach the system to understand what it was looking at. The process started by using optical character recognition (OCR) to extract text from receipt images. We then passed this text to a large language model, which knew how to identify things like company names, VAT numbers and where the total amount and tax portion of the bill appear. Importantly, this solution could achieve all this without needing a separate rule for every possible layout.
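The two-step flow can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the production implementation: run_ocr and ask_model are hypothetical placeholders for whichever OCR engine and language model are used, and the prompt wording is invented for the example.

```python
from typing import Callable

# Assumed prompt wording, for illustration only.
PROMPT_TEMPLATE = (
    "Extract the company name, VAT number, total amount and tax amount "
    "from the receipt text below. Answer with a JSON object only.\n\n"
    "Receipt text:\n{receipt_text}"
)


def build_prompt(receipt_text: str) -> str:
    """Wrap the raw OCR text in the extraction instructions."""
    return PROMPT_TEMPLATE.format(receipt_text=receipt_text.strip())


def process_receipt(
    image_bytes: bytes,
    run_ocr: Callable[[bytes], str],    # hypothetical: image bytes -> raw text
    ask_model: Callable[[str], str],    # hypothetical: prompt -> model answer
) -> str:
    """Step 1: OCR the image. Step 2: ask the model to interpret the text."""
    receipt_text = run_ocr(image_bytes)
    return ask_model(build_prompt(receipt_text))


# Stand-ins so the sketch runs without any external service.
def fake_ocr(image_bytes: bytes) -> str:
    return "Esimerkki Taksi Oy\nYhteensä 23,50 €\n"


def fake_model(prompt: str) -> str:
    return '{"company_name": "Esimerkki Taksi Oy", "total_amount": "23.50"}'


answer = process_receipt(b"<image bytes>", fake_ocr, fake_model)
print(answer)
```

Note that there is no layout-specific logic anywhere in the flow: a new receipt format goes through exactly the same two steps.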
To make the results reliable, we defined a clear data model: every piece of information had a precise place to go. The AI filled in as many of those fields as it could from the OCR text, producing a clean, structured output.
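As a rough illustration of what such a data model can look like, here is a minimal Python sketch. The field names and the JSON shape are assumptions made for this example, based only on the fields mentioned above; the real model is not shown here. Every field is optional, so anything the AI cannot find in the OCR text simply stays empty.

```python
import json
from dataclasses import dataclass
from decimal import Decimal
from typing import Any, Optional


@dataclass
class ReceiptData:
    """Hypothetical target structure: one precise place per piece of information."""
    company_name: Optional[str] = None
    vat_number: Optional[str] = None
    total_amount: Optional[Decimal] = None
    tax_amount: Optional[Decimal] = None


def _as_decimal(value: Any) -> Optional[Decimal]:
    """Convert a JSON number or string to Decimal, keeping missing values as None."""
    return None if value is None else Decimal(str(value))


def parse_model_output(raw_json: str) -> ReceiptData:
    """Map the model's JSON answer onto the data model, field by field."""
    data = json.loads(raw_json)
    return ReceiptData(
        company_name=data.get("company_name"),
        vat_number=data.get("vat_number"),
        total_amount=_as_decimal(data.get("total_amount")),
        tax_amount=_as_decimal(data.get("tax_amount")),
    )


# Made-up model answer in which the VAT number and tax amount could not be found.
raw = '{"company_name": "Esimerkki Taksi Oy", "total_amount": "23.50", "tax_amount": null}'
receipt = parse_model_output(raw)
print(receipt)
```

Amounts are stored as Decimal rather than float in this sketch to avoid rounding surprises with money, which is a design choice of the example rather than a detail from the project.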
In practice, this immediately made a massive difference. Instead of dozens of brittle parsing scripts, Taksikuutio now had one intelligent process that could handle new receipt formats out of the box. The model knows how to adapt to new layouts with no extra development required.
The new system both reduced manual work and made data flows more reliable. Error rates dropped, maintenance became easier, and the business could process far more receipts without adding development overhead.
In other words: we turned what used to be an increasingly complex rule-based process into a flexible, future-proof system that will serve Taksikuutio for years to come.
The project also highlights a wider shift. As AI tools become more capable, the way we design automation is changing. Instead of trying to cover every possible case with hand-crafted logic, we can now build systems that better understand exceptions and variations. In the long run, the foundation can be used to create even more sophisticated tools – essentially turning everyday documents into rich and actionable business data.