Supabase RLS failing in production
If Supabase Row Level Security (RLS) is failing in production, the bug is usually not the policy text alone. It is drift between roles, environments, and assumptions about who can do what.
RLS incidents get expensive fast because teams often 'fix' them by weakening security before they understand the failure path. The fastest way back is to confirm who the actor is, which token and role are live, and where production differs from the mental model that worked locally.
The typical symptoms: queries that worked locally suddenly return empty data or permission errors in production.
Users can see too much or too little data depending on which login path or token they used.
A server-side path works with the service role while the client path fails with real user tokens.
A hotfix changed the policy, but nobody can explain whether the result is actually safe.
First, verify which role is running for the failing path: anon, authenticated, or service_role.
Inspect the production JWT claims and compare them to what the policy expects, especially org, tenant, or user identifiers; the queries after this checklist surface both the live role and the claims.
Check for environment drift between local, preview, and production database schema or policy versions.
Confirm whether the query path changed recently, for example from client-side access to a server action or edge function.
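A minimal triage pass, assuming you can execute SQL from the failing request path itself, for example through a temporary RPC; public.projects stands in for whatever table the policy guards:

```sql
-- Who is this request, as the database sees it? Note the Supabase SQL
-- editor runs as the postgres role and bypasses RLS, so run this on the
-- failing path itself (e.g. via a temporary security-invoker function).
select
  current_user,                                 -- anon, authenticated, or service_role
  auth.uid(),                                   -- user id from the JWT; null for anon
  current_setting('request.jwt.claims', true);  -- raw claims the policies evaluate

-- Which policies are live on the failing table in this environment?
-- Run the same query locally and in preview, then diff the output
-- to spot drift.
select policyname, cmd, roles, qual, with_check
from pg_policies
where schemaname = 'public'
  and tablename = 'projects';                   -- stand-in table name
```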
The most common root causes: policies depend on claims or joins that are not present in production tokens or production data (the sketch after this list shows the classic empty-result version of this).
The application is using the wrong key or role for the request path that is failing.
Local and production schema or policy versions drifted after a fast migration or manual patch.
A rushed bypass or server-side workaround hid the underlying auth model instead of fixing it.
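As a sketch of that first cause, here is a policy shape that fails quietly when a custom claim never made it into production tokens; org_id and public.projects are hypothetical names:

```sql
-- Works locally because local tokens were seeded with an org_id claim.
-- In production the claim was never added, so auth.jwt() ->> 'org_id'
-- is null, the condition is never true, and every select returns empty.
create policy "org members can read projects"
on public.projects
for select
to authenticated
using (org_id = (auth.jwt() ->> 'org_id')::uuid);
```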
To recover, reproduce the failure with the exact production role and claims that the request path uses; the impersonation snippet below shows one way to do this in SQL.
Reduce the policy to the smallest safe condition set, then add complexity back only after each step is proven (see the rebuild sketch below).
Separate client, server, and admin access paths so the intended role model is obvious in code and in logs.
Document the final access assumptions before the next deployment so the team stops rediscovering them during incidents.
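One way to reproduce with the real role and claims, assuming a direct SQL session as the postgres role (which on Supabase is a member of the anon and authenticated roles, so it can impersonate them); the claim values are placeholders to fill in from a captured production token:

```sql
begin;

-- Drop to the role the failing path uses, and inject the captured
-- claims for this transaction only; rollback leaves nothing behind.
set local role authenticated;
select set_config(
  'request.jwt.claims',
  '{"sub": "user-uuid-from-prod", "role": "authenticated"}',  -- placeholder claims
  true  -- transaction-local setting
);

select * from public.projects;  -- should now behave exactly like production

rollback;
```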
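And a sketch of the reduce-then-rebuild step, again with hypothetical table and policy names; each condition that gets added back is re-tested with the impersonation harness above before the next one goes in:

```sql
-- Step 1: replace the complex policy with the smallest condition
-- you can prove.
drop policy if exists "org members can read projects" on public.projects;

create policy "owners can read own rows"
on public.projects
for select
to authenticated
using (user_id = auth.uid());

-- Step 2: only after step 1 passes under impersonation, layer org or
-- tenant conditions back one at a time, re-testing after each change.
```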
Escalate when the system is already lying
Once you can no longer trust what each role can actually see, debugging slows down fast. That is when a short rescue engagement earns its keep.
The team is considering disabling or bypassing RLS just to keep production moving.
Nobody can state with confidence which roles should have access to which records right now.
The issue crosses customer data boundaries, regulated data, or shared-tenant isolation risk.
FAQs
Short answers for the questions that usually come up once the problem is real.
Start with the audit before the next expensive wrong turn
The audit is built for exactly this stage: one workflow, one production problem, or one decision that needs to get clearer before more time is burned.