Our week in review: Performance Optimisation and Production Readiness (week 13)

The week everything got slower (and then faster)

Calendar week 46 (week 13 of the build) started with a problem: property pages were taking 4-5 seconds to load.

Not the database queries. Those were fast—104ms to fetch property data. But the overall page load? Painfully slow.

Users were noticing. "Is something wrong with the site? Pages are really slow today."

Nothing had changed. Well, nothing except we now had dozens of properties each with 10-15 photos. And every photo needed thumbnail generation. And I'd been generating those thumbnails on-demand.

This week was about making LetAdmin production-ready. Not adding features—fixing performance, adding monitoring, handling edge cases discovered through real usage. The boring but essential work that separates "functional prototype" from "production system."

The image generation problem I'd been ignoring

Here's what was happening: a user clicks a property page, the server fetches property data (fast!), the page renders with 12 photos, each photo needs a 600x600 thumbnail, and Active Storage generates each thumbnail on demand, taking 900-1,300ms per image.

So property data loads in 100ms, but the page takes 4-5 seconds to become fully usable because thumbnails are processing in real-time.

The fix? Preprocess image variants in background jobs. When photos upload, generate all variants asynchronously. Property pages serve pre-generated thumbnails instead of generating on-demand.
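To make the idea concrete, here is a minimal, framework-free sketch of the pattern: the upload path only enqueues work, a background worker builds the variants, and the page path only looks up a file that already exists. It is not the Active Storage API and not necessarily how LetAdmin does it. The names VARIANTS, VariantStore, generate_variant, upload_photo and thumbnail_for are all hypothetical, and the resize step is a stub. In a Rails app the same effect is usually reached through Active Storage's named variants and a job queue; the post does not say which mechanism was used.

```ruby
# Framework-free sketch: pre-generate image variants in a background worker.
# Every name below is hypothetical; only the 600x600 size comes from the post.

VARIANTS = { thumb: [600, 600] }.freeze

# Stand-in for the expensive resize; a real app would call an image library.
def generate_variant(photo_id, name, size)
  "#{photo_id}-#{name}-#{size.join('x')}.jpg"
end

# Thread-safe lookup table of finished variants.
class VariantStore
  def initialize
    @files = {}
    @lock = Mutex.new
  end

  def put(photo_id, name, file)
    @lock.synchronize { @files[[photo_id, name]] = file }
  end

  def get(photo_id, name)
    @lock.synchronize { @files[[photo_id, name]] }
  end
end

STORE = VariantStore.new
JOBS = Queue.new

# Background worker: drains the queue and builds every variant for each photo.
WORKER = Thread.new do
  while (photo_id = JOBS.pop)
    VARIANTS.each do |name, size|
      STORE.put(photo_id, name, generate_variant(photo_id, name, size))
    end
  end
end

# Upload path: enqueue and return immediately; no resizing happens here.
def upload_photo(photo_id)
  JOBS << photo_id
end

# Page render path: a plain lookup, no image processing.
def thumbnail_for(photo_id)
  STORE.get(photo_id, :thumb)
end

(1..12).each { |id| upload_photo(id) }
JOBS.close   # no more uploads in this demo; pop returns nil and the loop ends
WORKER.join

puts thumbnail_for(1)  # prints "1-thumb-600x600.jpg"
```

The point of the split is that the slow step runs once per upload instead of once per page view, and never inside a request a user is waiting on.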

Result: property pages now load in under a second, down from 4-5 seconds. That's an improvement of roughly 80%. Users stopped complaining about slow pages.

The interesting part? I knew this would be a problem eventually. I just hoped "eventually" wouldn't arrive so soon. But with real agencies uploading hundreds of photos, "eventually" was now. Performance problems don't wait for convenient timing.

The monitoring I should have built earlier

Second problem: I didn't know about the slow photo generation until users complained. No monitoring, no alerts, no visibility into performance problems.

This week I built proper monitoring infrastructure: automatic detection of slow database queries over 100ms, request timing for every controller action, email alerts when performance degrades, rake tasks to analyze logs and suggest missing indexes.
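As an illustration of the request-timing piece, here is a small sketch written as Rack-style middleware in plain Ruby, with no gems required. It times whole requests rather than individual controller actions, so treat it as an approximation of what the post describes. The class name RequestTimer, the 500ms default and the log format are assumptions; the post only gives a threshold for database queries.

```ruby
# Sketch of per-request timing as Rack-style middleware (plain Ruby, no gems).
# SLOW_REQUEST_MS and the recorded fields are assumptions, not from the post.

class RequestTimer
  SLOW_REQUEST_MS = 500

  attr_reader :slow_requests

  def initialize(app, threshold_ms: SLOW_REQUEST_MS)
    @app = app
    @threshold_ms = threshold_ms
    @slow_requests = []
  end

  # Rack calling convention: call(env) returns [status, headers, body].
  def call(env)
    started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
    response = @app.call(env)
    elapsed_ms = (Process.clock_gettime(Process::CLOCK_MONOTONIC) - started) * 1000.0
    if elapsed_ms > @threshold_ms
      @slow_requests << { path: env["PATH_INFO"], ms: elapsed_ms.round(1) }
    end
    response
  end
end

# A toy app whose /slow path sleeps past the threshold.
app = lambda do |env|
  sleep 0.05 if env["PATH_INFO"] == "/slow"
  [200, { "content-type" => "text/plain" }, ["ok"]]
end

timer = RequestTimer.new(app, threshold_ms: 20)
timer.call("PATH_INFO" => "/fast")
timer.call("PATH_INFO" => "/slow")
timer.slow_requests.each { |r| puts "slow request: #{r[:path]} took #{r[:ms]}ms" }
```

A monotonic clock is used instead of Time.now so that system clock adjustments cannot produce negative or inflated timings. An email alert would hang off the same branch that records the slow request.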

Now when something slows down, I know immediately. Before users notice. Before it becomes a problem.

The monitoring uses some Ruby metaprogramming that I'm oddly proud of: monkey-patching ActiveRecord to intercept all database queries, measuring execution time, and logging slow queries with stack traces. It operates silently when everything's fine, only alerting when thresholds are exceeded.
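The interception technique can be sketched in plain Ruby with Module#prepend, which is one common way to wrap an existing method. This is an illustration only: FakeConnection is a hypothetical stand-in for a database adapter, not ActiveRecord, and SlowQueryLogger is an invented name. Only the 100ms threshold comes from the post. Rails also publishes a sql.active_record event through ActiveSupport::Notifications, which would give similar data without patching anything.

```ruby
# Sketch of slow-query interception via Module#prepend (plain Ruby).
# FakeConnection stands in for a database adapter; it is NOT ActiveRecord.
# The 100ms threshold is from the post; everything else is hypothetical.

module SlowQueryLogger
  THRESHOLD_MS = 100

  def self.events
    @events ||= []
  end

  # Wraps the original #execute: times it and records slow calls together
  # with a trimmed stack trace so the calling code can be found.
  def execute(sql)
    started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
    super
  ensure
    elapsed_ms = (Process.clock_gettime(Process::CLOCK_MONOTONIC) - started) * 1000.0
    if elapsed_ms > THRESHOLD_MS
      SlowQueryLogger.events << { sql: sql, ms: elapsed_ms.round(1), trace: caller.first(5) }
    end
  end
end

class FakeConnection
  def execute(sql)
    sleep 0.15 if sql.include?("photos")  # simulate one slow query
    :ok
  end
end

# The "monkey-patch": every FakeConnection#execute now goes through the logger.
FakeConnection.prepend(SlowQueryLogger)

conn = FakeConnection.new
conn.execute("SELECT * FROM properties")
conn.execute("SELECT * FROM photos")
SlowQueryLogger.events.each { |e| puts "SLOW (#{e[:ms]}ms): #{e[:sql]}" }
```

Because the timing sits in an ensure clause, queries that raise are still measured, and the wrapped method's return value passes through unchanged.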

Is it perfect? No. Does it catch most performance issues? Yes. And that's what matters.

The certificate management chaos

Third problem: duplicate certificates appearing through API usage. An agency uploads a gas safety certificate via the API, realizes they uploaded the wrong file, and uploads again. Now they have two gas safety certificates for the same property. Which one is current?

The UI didn't handle this well. Pagination broke when navigating backwards due to Alpine.js key reuse bugs. Viewing certificates required extra clicks through modals. Sorting was wrong (oldest first instead of newest first).

This week I added:

Duplicate detection: Rake task that finds duplicate certificates (same property, same type) and reports them.

Cleanup tools: Automatically keep newest certificate, delete older duplicates.

Better sorting: Newest certificates first, matching user expectations.

Direct links: Click certificate name, PDF opens. No modal, no extra steps.

Pagination fixes: Unique keys for Alpine components preventing DOM reuse bugs.
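The duplicate detection and cleanup rules above (same property, same type, keep the newest) can be sketched on plain data. This is a hedged illustration rather than the actual rake task: the Certificate struct, its field names and the sample records are all made up for the example, and a real task would query the database instead of an array.

```ruby
# Sketch of duplicate detection and cleanup on plain data (no database).
# Certificate and its fields are hypothetical stand-ins for the real model.
require "date"

Certificate = Struct.new(:id, :property_id, :kind, :uploaded_at, keyword_init: true)

# Groups by (property, type) and returns only groups with more than one
# entry, each sorted newest first.
def duplicate_groups(certificates)
  certificates
    .group_by { |c| [c.property_id, c.kind] }
    .select { |_key, group| group.size > 1 }
    .transform_values { |group| group.sort_by(&:uploaded_at).reverse }
end

# For each duplicate group, keep the newest and return the rest for deletion.
def certificates_to_delete(certificates)
  duplicate_groups(certificates).values.flat_map { |group| group.drop(1) }
end

# Made-up sample data for the demo.
certs = [
  Certificate.new(id: 1, property_id: 10, kind: "gas_safety", uploaded_at: Date.new(2025, 11, 10)),
  Certificate.new(id: 2, property_id: 10, kind: "gas_safety", uploaded_at: Date.new(2025, 11, 12)),
  Certificate.new(id: 3, property_id: 10, kind: "epc",        uploaded_at: Date.new(2025, 11, 11)),
]

duplicate_groups(certs).each do |(property_id, kind), group|
  puts "property #{property_id}, #{kind}: #{group.size} copies, newest is ##{group.first.id}"
end
puts "would delete: #{certificates_to_delete(certs).map(&:id).inspect}"
```

Keeping the report step separate from the delete step means the cleanup can be run as a dry run first, which matters when "newest" is only a heuristic for "correct".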

Boring polish work. But it makes the system feel professional instead of rough around the edges.

The API improvements nobody asked for (yet)

While fixing certificate management, I added DELETE endpoints to the properties API. Because eventually someone's going to need to remove a certificate via API, and they'll be frustrated when the endpoint doesn't exist.

Building features before anyone asks is usually premature. But API endpoints are different—once you ship an API without proper coverage, adding endpoints later means versioning complexity. Better to build complete APIs from the start.

Also added proper API error responses with meaningful messages. Because cryptic error codes are the worst.
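For a sense of what a meaningful error response might look like, here is a small sketch of a consistent JSON error body. The shape is an assumption: the field names (error, code, message, details), the api_error helper and the example ids are invented for illustration and are not LetAdmin's actual format.

```ruby
# Sketch of a consistent JSON error body. The field names and the helper
# are an assumed shape, not LetAdmin's actual API format.
require "json"

# Returns a Rack-style triple: status, headers, JSON string body.
def api_error(status:, code:, message:, details: {})
  body = { error: { code: code, message: message, details: details } }
  [status, { "content-type" => "application/json" }, JSON.generate(body)]
end

# Hypothetical example: a DELETE request for a certificate that does not exist.
status, _headers, body = api_error(
  status: 404,
  code: "certificate_not_found",
  message: "No certificate with id 42 exists for property 7.",
  details: { property_id: 7, certificate_id: 42 }
)
puts status
puts body
```

A stable machine-readable code next to a human-readable message lets integrators branch on the former and log the latter.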

What agencies actually get

If you're running a letting agency, here's what this week means for you:

Faster property pages. Load times dropped from 4-5 seconds to under 1 second. Browsing properties is now responsive instead of frustrating.

Better certificate management. Duplicates get detected and cleaned. Pagination works reliably. Viewing certificates requires fewer clicks.

More reliable system. Performance monitoring catches issues before they become problems. When something slows down, I fix it immediately.

Complete API. If you're integrating with other systems, the API now has proper DELETE endpoints and clear error messages.

This week's work is mostly invisible. You won't notice the monitoring infrastructure. You won't think about image preprocessing. You'll just know that pages load fast and certificates work properly.

That's the goal—production readiness means users don't think about the technology, they just get work done.

What I learned

This week taught me that feature completeness isn't the same as production readiness.

I could have kept adding features—landlord portals, maintenance workflows, whatever. But if the system is slow, users won't care about new features. They'll just find a faster alternative.

Performance monitoring should exist from day one. I waited too long to build it. Relying on user complaints to discover performance issues is embarrassing. Users shouldn't be your monitoring system.

Edge case handling matters. Duplicate certificates? Rare. But when it happens, the system needs to handle it gracefully. Polish is about handling edge cases without making users work around bugs.

Also: background job preprocessing is your friend. Any expensive operation that can happen asynchronously should happen asynchronously. Don't make users wait for work that doesn't need to block their request.

What's next

Thirteen weeks. From empty Rails app to production-ready letting management platform.

Built property management, landlord tracking, offline inspections, AI-powered reports, Google Workspace integration, real-time collaboration, key management, utility meters, and comprehensive performance monitoring.

Is it perfect? No. Will it ever be? Also no. Software is never finished.

But it's production-ready. Agencies can manage hundreds of properties reliably. Performance is good. Features work. Edge cases are handled.

Next? Honestly, probably a break from building. Time to talk to more agencies, understand what's working, what's not, what's missing.

Because the best way to build software isn't guessing what users need—it's listening to what they're struggling with and building solutions to actual problems.

Thirteen weeks was intense. But we built something real. Something useful. Something that actually helps letting agencies do their jobs better.

Not bad for three months.

Monday, November 17, 2025