
I Chose DynamoDB for a Social App. Here's Why I'd Undo It.
I picked DynamoDB through Amplify for a professional networking app with swipeable cards. The first two weeks were magic. Then we needed queries DynamoDB was never built to answer.
Here's the query that broke everything:
"Show me nearby professionals I haven't swiped on, in my industry, sorted by relevance."
Postgres can answer that in one query. DynamoDB via Amplify turned it into 9 database calls, 3 in-memory joins, and a Lambda function I'm still embarrassed about.
I built a professional networking app. Think LinkedIn's connection model with Tinder's card interface. Swipe right on people you want to connect with, match when both swipe right, skip the awkward connection request.
The stack? React Native on the frontend. AWS Amplify for the backend. Cognito for auth, AppSync for GraphQL, DynamoDB for storage. Amplify auto-generated the API layer, the resolvers, the tables. I had a working backend in two days.
Two days. That number mattered to me at the time. It shouldn't have.
Why Amplify made sense
I'm not going to pretend the decision was reckless. It wasn't. Every checkbox checked:
Serverless. No database to manage. Scales automatically. Free tier covers early development. Auth, API, and database wired together out of the box. GraphQL schema generates your entire backend.
For a mobile app with a tight deadline and a small team, this pitch is almost impossible to refuse. You write a GraphQL schema, run amplify push, and AWS provisions Cognito, AppSync, and DynamoDB tables for you. It handles auth flows, API endpoints, and database access patterns. All from one CLI command.
Week one was genuinely impressive. User profiles worked. Swipe recording worked. Match detection (both users swipe right) worked via DynamoDB Streams. I built the core loop in days, not weeks.
But here's what I didn't realize: Amplify optimized for the easiest queries. The product I was building depended on the hardest ones.
The first crack
It showed up around month two. Product wanted a "mutual connections" feature. Show how many shared connections two users have before you swipe.
In Postgres:
```sql
SELECT COUNT(*) FROM connections
WHERE connected_to IN (
  SELECT connected_to FROM connections WHERE user_id = $current_user
)
AND user_id = $candidate_id;
```
One query. Runs in milliseconds with the right index.
In DynamoDB, there's no join operation. None. So the approach becomes:
- Fetch current user's connection list from Connections table
- Fetch candidate user's connection list from Connections table
- Intersect the two lists in application code
- Repeat for every card you show
That's 2 database calls per card. Show 20 cards, that's 40 calls just for the "mutual connections" badge. And the connection lists grow with time. A user with 500 connections means fetching and intersecting 500-item arrays on every swipe.
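The intersection step itself is trivial application code. A sketch of what I ended up shipping (the function name is mine, and the two commented-out fetches stand in for the DynamoDB round trips):

```typescript
// Count mutual connections entirely in application code,
// because DynamoDB cannot join the two connection lists.
function mutualConnectionCount(mine: string[], theirs: string[]): number {
  const mySet = new Set(mine); // O(n) to build
  return theirs.filter((id) => mySet.has(id)).length; // O(m) to scan
}

// Per card: two Query round trips, then the in-memory intersect.
// const mine = await fetchConnections(currentUserId); // call 1
// const theirs = await fetchConnections(candidateId); // call 2
// const badge = mutualConnectionCount(mine, theirs);
```

The code is easy. The problem is that it runs per card, on lists that grow without bound, over the network.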
I built it. It worked. But "worked" is doing heavy lifting in that sentence.
The query that couldn't exist
Month three brought the real problem. The swipe feed itself.
A swipeable card app IS a recommendation engine. The core question is always: "Which profile do I show next?" For a professional networking app, that means:
- Within 10 miles of the user's current location
- In a matching industry or role
- Not already swiped on (could be thousands of user IDs)
- Sorted by a relevance score
- Paginated, 20 at a time
DynamoDB has no native geospatial queries. Zero. The workaround is geohashing: split the map into grid cells, store each user's geohash as a partition key, and query by cell. To cover a 10-mile radius you need to query 9 cells (the user's cell plus all adjacent ones). That's 9 separate DynamoDB queries before you've filtered anything.
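To make the fan-out concrete, here's a simplified stand-in for the geohash scheme. This is a fixed-size grid, not a real geohash implementation (real geohashing interleaves latitude and longitude bits), and the cell size is an assumption, but the 3x3 fan-out is the same:

```typescript
// Bucket coordinates into fixed-size grid cells, then enumerate the 3x3
// neighborhood around the user's cell. Each cell string becomes a
// partition key, and each partition key means one separate Query.
const CELL_DEG = 0.25; // ~17 miles of latitude per cell; tune to the radius

function cellOf(lat: number, lng: number): [number, number] {
  return [Math.floor(lat / CELL_DEG), Math.floor(lng / CELL_DEG)];
}

function cellsToQuery(lat: number, lng: number): string[] {
  const [r, c] = cellOf(lat, lng);
  const cells: string[] = [];
  for (let dr = -1; dr <= 1; dr++) {
    for (let dc = -1; dc <= 1; dc++) {
      cells.push(`${r + dr}#${c + dc}`); // user's cell plus 8 neighbors
    }
  }
  return cells; // 9 cells => 9 separate DynamoDB Query calls
}
```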
But filtering is where it falls apart completely. DynamoDB filters happen AFTER it reads items from the table. You can't push the exclusion set (thousands of already-swiped user IDs) into the key condition. So DynamoDB reads every user in the geohash cell, returns them all, and your application code filters out the ones already swiped.
At 5,000 users in a metro area, the "show next card" endpoint was doing:
- 9 geohash queries to DynamoDB (nearby users)
- 1 query to fetch the current user's swipe history
- In-memory filter: remove already-swiped users
- In-memory filter: apply industry/role matching
- In-memory sort: by relevance score, then distance
- Return 20 results
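The in-memory half of that pipeline reduces to something like this (a sketch with hypothetical types; the nine cell queries and the swipe-history fetch happen before this function runs):

```typescript
interface Candidate {
  id: string;
  industry: string;
  score: number;    // precomputed relevance
  distance: number; // miles from the current user
}

// Everything DynamoDB could not push into a key condition happens here,
// on the full candidate set pulled from all nine geohash cells.
function nextCards(
  candidates: Candidate[],
  swiped: Set<string>,
  industries: Set<string>,
  limit = 20,
): Candidate[] {
  return candidates
    .filter((c) => !swiped.has(c.id))          // exclusion set
    .filter((c) => industries.has(c.industry)) // industry/role match
    .sort((a, b) => b.score - a.score || a.distance - b.distance)
    .slice(0, limit);                          // "pagination"
}
```

Note what this implies: the database already returned, and billed for, every candidate this function throws away.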
Compare that to the Postgres version with PostGIS:
```sql
SELECT u.*, s.score,
       ST_Distance(u.location, ST_MakePoint($lng, $lat)) AS dist
FROM users u
JOIN scores s ON u.id = s.user_id
WHERE ST_DWithin(u.location, ST_MakePoint($lng, $lat), 16093)
  AND u.industry = ANY($target_industries)
  AND u.id NOT IN (
    SELECT swiped_id FROM swipes WHERE swiper_id = $me
  )
ORDER BY s.score DESC, dist ASC
LIMIT 20;
```
One query. Geolocation, filtering, exclusion, sorting, pagination. All in the database where it belongs.
The Amplify tax
Amplify made this worse in ways I didn't expect.
Scan by default. Amplify's auto-generated resolvers use DynamoDB Scan operations for any query with filters. Not Query. Scan. That means reading every item in the table, then filtering. At 10,000 users, a "list users in my city" query reads 10,000 items and returns maybe 200. You pay for all 10,000 reads.
There's actually a GitHub issue about this that's been open since June 2024. Amplify's generated resolvers ignore Global Secondary Indexes for one-to-many relationship queries, falling back to Scan when a Query would work. One developer reported that fixing this manually saved $15,000 per year and cut page load times from 3 seconds to 300 milliseconds. Same data. Same database. Just using it correctly.
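The billing mechanics are worth spelling out, because Scan bills for every item examined while Query bills only for what the index returns. A rough sketch of the arithmetic (the per-million price and the assumption that each item fits in one read request unit are mine; check current DynamoDB on-demand pricing for your region):

```typescript
// Rough on-demand read cost. Scan pays for every item it examines;
// Query against a well-designed index pays only for items returned.
function readCostUSD(itemsRead: number, pricePerMillionReads: number): number {
  // Assumes one read request unit per item (item <= 4 KB).
  return (itemsRead / 1_000_000) * pricePerMillionReads;
}

// The "list users in my city" example: 10,000-item table, 200 matches.
// Scan reads 10,000; a Query on a city GSI would read 200 — a 50x gap
// that compounds on every single request the app makes.
```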
Tables everywhere. DynamoDB best practice (from AWS's own documentation) is single-table design. Put all entity types in one table, use composite keys to model relationships. Amplify does the opposite. Every @model in your GraphQL schema creates a separate DynamoDB table. User table. Swipe table. Match table. Connection table. ProfileView table. Five tables minimum for a networking app.
The problem? AppSync can't join across tables. So every query that touches multiple entity types requires either a pipeline resolver (complex, fragile) or multiple round trips from the client. The GraphQL that was supposed to simplify data fetching now orchestrates 5 separate DynamoDB calls per screen load.
VTL hell. Amplify writes its resolvers in Apache Velocity Template Language. VTL has no unit testing framework, no IDE support worth mentioning, no type safety, and debugging requires tailing AppSync CloudWatch logs with a 30-second delay. When Amplify's generated VTL does the wrong thing (like Scanning instead of Querying), you're reading auto-generated template code to figure out why.
I could have replaced VTL with Lambda resolvers. But Lambda adds cold starts (200-500ms at 128MB) to every query, which is brutal for a swipe feed that needs to feel instant. JavaScript resolvers were the better option, but Amplify defaults to VTL, and switching requires rewriting the entire resolver layer.
The bill
Month four. AWS billing alert.
The DynamoDB bill went from $12 to $800 in one month. The growth wasn't gradual. It spiked because on-demand DynamoDB charges per read/write unit, and every feature we'd built was using Scans instead of Queries. We were reading the entire User table on every swipe feed request, the entire Swipe table to build exclusion sets, and the entire Connection table for mutual connection counts.
The fix wasn't simple. It required:
- Designing new Global Secondary Indexes for each access pattern
- Backfilling existing data into the new indexes
- Rewriting Amplify's generated resolvers (or replacing them with custom code)
- Monitoring for hot partitions after the changes
Each GSI change meant a DynamoDB migration. Not a one-line CREATE INDEX. Provisioning a new GSI, a backfill Lambda, a code deployment, and a prayer that you didn't hit the 20-GSI limit per table.
We got the bill back down. But the engineering time spent fighting the database could have built three features.
What DynamoDB actually excels at
I want to be specific about this, because "DynamoDB is bad" isn't the lesson. DynamoDB is bad for what I used it for.
Instacart migrated FROM Postgres TO DynamoDB for their product catalog, hundreds of millions of SKUs with simple access patterns: get product by ID, list products in category. Write-heavy, key-value shaped, horizontal scale beyond a single Postgres node. DynamoDB was the correct choice.
DynamoDB is excellent for:
- Session stores and feature flags (key-value lookup by ID)
- IoT event ingestion (high-throughput writes, partitioned by device + time)
- Idempotency tables (check-and-set by unique key)
- Shopping carts, game state, anything where the access pattern is "get this one thing by its ID"
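The idempotency case shows why these fit so well: a conditional PutItem with a ConditionExpression of attribute_not_exists(pk) either claims a key or fails, atomically, in one call. Here's the semantics sketched in memory (the Map stands in for the table; this is not the AWS SDK call):

```typescript
// In-memory stand-in for an idempotency table. DynamoDB's version is a
// PutItem with ConditionExpression "attribute_not_exists(pk)": the write
// succeeds exactly once per key, with no read-modify-write race.
class IdempotencyTable {
  private seen = new Map<string, string>();

  // Returns true if this key was claimed now, false if already processed.
  claim(requestId: string, result: string): boolean {
    if (this.seen.has(requestId)) return false; // condition check fails
    this.seen.set(requestId, result);           // conditional put succeeds
    return true;
  }
}
```

One key, one condition, one round trip. That shape is DynamoDB's home turf; a social graph is the opposite shape.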
A social network is none of those things. It's a graph. Relationships between users are the core data structure. Every interesting query involves traversing or filtering across relationships. That's relational by definition.
What I'd pick today
Postgres. On one of the newer serverless options (RDS Serverless v2, Neon, or Supabase) that address the only valid argument Amplify had: you don't want to manage a database server during MVP.
Neon has true scale-to-zero. Supabase gives you Postgres plus auth plus row-level security (the thing Amplify's AppSync does badly in application code). Both support PostGIS.
The features I spent months building workarounds for would have been one-day SQL queries:
- Mutual connections: a single join with a count
- Nearby users: PostGIS ST_DWithin with arbitrary filters in the same query
- Swipe exclusion: a NOT IN subquery
- Feed ranking: ORDER BY score DESC, distance ASC
- Profile views, analytics, admin dashboards: SQL. Just SQL.
Cost at MVP scale: $0-30/month versus the $800+ DynamoDB spike we hit with 5,000 users.
Instagram ran on Postgres at 30 million users. Your networking MVP doesn't need DynamoDB's throughput. It needs query flexibility.
The decision framework I use now
Before choosing a database, I ask four questions:
What is my hardest query? Not the most common one. The hardest. For a social app, it was the recommendation feed. For an e-commerce app, it might be search. For an IoT platform, it might be time-series aggregation. The hardest query determines the database, not the easiest one.
How stable are my access patterns? If I know every query the app will ever run (I'm cloning a proven product with a frozen spec), DynamoDB can work. If I'm building a startup where the product will pivot three times in year one, I need a database that handles ad-hoc queries without a migration.
What's the real scale target? Postgres handles 100K requests per second on modern hardware. If my year-one target is 10,000 users, I don't need a database designed for Amazon's internal traffic.
Am I choosing for deployment or for data? This is the one I got wrong. Amplify's pitch was about deployment speed, not data modeling. "Backend in a weekend" is a deployment claim. It says nothing about whether the database fits your data.
The uncomfortable part
The hardest thing about this mistake wasn't the bill or the performance. It was the velocity tax.
Every feature request turned into an architecture discussion. "Can we add industry filters to the feed?" became a two-day conversation about GSI design, backfill strategy, and read capacity planning. In Postgres, it would have been AND industry = $1. Done by lunch.
My team stopped thinking about the product and started thinking about the database. That's the real cost. Not the $800 AWS bill. The months of engineering time spent making DynamoDB do things Postgres does out of the box.
Choose your database for the queries you'll write, not the deploy you'll run.