SaaS Starter

Build: SaaS Starter

Toggle features and choose options to customize your spec

Technical Spec
Sign in with GitHub
Enterprise single sign-on

Access Control*

No roles — every authenticated user has the same access
Simple two-tier access control
Custom roles with fine-grained permissions

Multi-factor Authentication

Single-factor only
Time-based one-time passwords

Tradeoffs

Complexity: OAuth providers added

Each provider requires an OAuth app registration and a key rotation policy

Complexity: SAML/SSO selected

Requires coordinating with each customer's IdP and handling an XML-based protocol; significant integration work

Complexity: RBAC selected

Permission checks must be applied consistently across every data access path

Payments & Billing (Required, high)

Billing Model*

Single charge per product or service
Monthly or annual recurring billing
Charge based on consumption (API calls, seats, etc.)

Payment Processor*

Full-featured; best-in-class developer experience
Merchant of record; handles VAT/tax automatically

Tradeoffs

Complexity: Subscription billing selected

Requires handling trial periods, dunning, proration, and cancellation flows

Complexity: Usage-based billing selected

Must instrument every billable action and send metered events to billing provider

UX: Paddle selected

Less customizable checkout; Paddle acts as legal seller so you avoid VAT registration

Notifications (medium)

Delivery Method*

Push notifications instantly as events occur
Client polls server on a fixed interval
Send email when user is offline
Native push notifications for mobile apps

User Control

Simple global on/off toggle
Separate preferences per event category
Highly granular per-item preferences

Tradeoffs

Cost: Real-time delivery selected

Requires persistent connection infrastructure (e.g. Redis pub/sub, WebSocket server)

Latency: Polling selected

Higher server request volume; notifications may lag by poll interval

Complexity: Mobile push selected

Requires APNs/FCM credentials and certificate management

Complexity: Per-source granularity selected

Significantly more complex preference storage and UI

Roles & Permissions (Required, high)

Authorization Model*

A fixed enum on the user record gates admin-only routes.
Users are assigned roles; roles bundle permissions; code checks permissions, not roles.
Permissions derived from attributes or graph relationships (owner, member, parent folder, etc.).
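
The role-based option above (roles bundle permissions; code checks permissions, not roles) might look roughly like this. It is a minimal sketch: the role names, permission strings, and the `can` helper are hypothetical, not part of the spec.

```python
from dataclasses import dataclass, field

# Hypothetical role-to-permission mapping; names are illustrative only.
ROLE_PERMISSIONS: dict[str, frozenset[str]] = {
    "admin": frozenset({"project:read", "project:write", "billing:manage"}),
    "member": frozenset({"project:read", "project:write"}),
    "viewer": frozenset({"project:read"}),
}


@dataclass
class User:
    id: str
    roles: set[str] = field(default_factory=set)


def permissions_for(user: User) -> frozenset[str]:
    """Union of the permissions bundled into each of the user's roles."""
    granted: set[str] = set()
    for role in user.roles:
        granted |= ROLE_PERMISSIONS.get(role, frozenset())
    return frozenset(granted)


def can(user: User, permission: str) -> bool:
    """Call sites ask about a permission, never about a role name."""
    return permission in permissions_for(user)
```

Because call sites only ever name a permission, adding or renaming a role changes the mapping table and nothing else.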

Permission Scope*

Permissions apply across the entire product.
A user has different roles in different workspaces or organizations.
Access lives on the resource itself — share a single document with specific users.

Custom Role Management

Roles (admin / member / viewer) are defined in code; customers cannot change them.
Admins can create roles and assign permissions.

Tradeoffs

Complexity: Simple roles chosen

Fast to build but every 'special case' access rule becomes bespoke code that's hard to audit

Complexity: ABAC / ReBAC chosen

Requires a policy engine and relationship store kept in sync with primary data

Latency: Per-resource scope enabled

Every list/read query must filter by ACL; expect query-plan work and caching investment

Cost: Customer-defined roles enabled

Support load increases substantially, since each customer now has a unique permission configuration

Multi-tenancy & Workspaces (Required, high)

Data Isolation Model*

Every table has a tenant_id column; every query filters on it.
One Postgres schema per tenant; tables are replicated across schemas.
Each tenant gets an isolated database (sometimes isolated VPC).
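
For the shared-schema option above (every table has a tenant_id column; every query filters on it), one way to make the filter hard to forget is a repository that cannot be constructed without a tenant. The sketch below uses an in-memory SQLite database purely for illustration; the `projects` table and class name are assumptions.

```python
import sqlite3


class TenantScopedProjects:
    """Repository bound to one tenant; every query it issues filters on tenant_id."""

    def __init__(self, conn: sqlite3.Connection, tenant_id: str) -> None:
        self._conn = conn
        self._tenant_id = tenant_id

    def add(self, name: str) -> None:
        self._conn.execute(
            "INSERT INTO projects (tenant_id, name) VALUES (?, ?)",
            (self._tenant_id, name),
        )

    def list_names(self) -> list[str]:
        rows = self._conn.execute(
            "SELECT name FROM projects WHERE tenant_id = ? ORDER BY name",
            (self._tenant_id,),
        )
        return [name for (name,) in rows]


def make_demo_db() -> sqlite3.Connection:
    """Create the hypothetical projects table in an in-memory database."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE projects (tenant_id TEXT NOT NULL, name TEXT NOT NULL)")
    return conn
```

Application code receives a `TenantScopedProjects` instance and never writes raw SQL against the table, so a missed filter becomes a code-review problem in one class rather than in every query.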

Tenant Identification*

Tenant context derived from hostname; clean URLs and cookie scoping.
Tenant context read from the URL path.
One URL; user picks active tenant after login and it persists in their session.
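
The hostname option above could be resolved with a helper like the following. This is a sketch under stated assumptions: `example.com` as the base domain and the reserved subdomain list are placeholders.

```python
from typing import Optional

BASE_DOMAIN = "example.com"       # assumption: the app is served from *.example.com
RESERVED = {"www", "app", "api"}  # assumption: subdomains that are not tenants


def tenant_from_host(host: str) -> Optional[str]:
    """Return the tenant slug for 'acme.example.com', or None if there is none."""
    # Drop an optional port, normalize case, and strip a trailing dot.
    hostname = host.split(":", 1)[0].lower().rstrip(".")
    suffix = "." + BASE_DOMAIN
    if not hostname.endswith(suffix):
        return None
    slug = hostname[: -len(suffix)]
    # Reject the bare domain, nested subdomains, and reserved names.
    if not slug or "." in slug or slug in RESERVED:
        return None
    return slug
```

Matching on the full `.example.com` suffix, rather than a substring, avoids treating look-alike hosts as tenants.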

Per-tenant Configuration

Enable/disable features for specific tenants (beta access, custom enterprise features).
Logo, colors, and optionally custom domain in tenant-facing UI.
Auth policy, session duration, data retention, allowed domains — configured per tenant.

Tradeoffs

Complexity: Shared DB, shared schema

Cheapest ops and easiest queries, but a single missed tenant_id filter is a data leak

Cost: Database-per-tenant

Operational cost scales with tenant count; migrations must run against every tenant DB

Complexity: Subdomain routing

Wildcard TLS cert and DNS configuration required; vanity domains compound that

Complexity: Per-tenant feature flags

Testing surface expands: every change must consider flag combinations and tenant configs

Transactional Email (Required, medium)

Delivery Provider*

Third-party email service with an API, templates, and deliverability monitoring.
Cheapest at volume; minimal tooling built in.
Your own MTA (Postfix, Haraka) on your own IPs.

Deliverability Setup*

Authenticate your sending domain with SPF and DKIM; publish a DMARC policy; monitor reports.
Consume bounce and spam-complaint webhooks; suppress invalid or unsubscribed addresses.
Different sending domains / IPs for product email vs marketing campaigns.
Your own sending IP instead of a shared pool.

Templating Approach*

Email templates live in your repo, reviewed and tested like any other code.
Templates managed in the provider dashboard; non-engineers can edit.
Single service manages email, in-app, SMS, and push with per-user preferences.

Tradeoffs

Cost: Managed provider chosen

Vendor cost scales with volume; deliverability expertise comes included

Complexity: SES chosen

Low per-email cost but you own deliverability operations (reputation, bounces, suppression)

Complexity: Separate streams for marketing vs transactional

Two sending configurations and domains to maintain; worth it for deliverability isolation

Audit Logging (medium)

What to Log*

Sign-in, sign-out, failed login, password reset, MFA enrollment, session revocation.
Role changes, permission grants, user invites, impersonation sessions, configuration changes.
Record who viewed or exported which records.
Record state changes to business records with before/after values.

Storage Backend*

An `audit_log` table with insert-only permissions; never updated or deleted.
Stream audit events to S3 Object Lock, AWS QLDB, or a dedicated audit platform.
Primary audit storage lives in your SIEM or the customer's.
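
The append-only table option above relies on insert-only database permissions. SQLite has no roles, so the sketch below swaps in triggers that reject UPDATE and DELETE to show the same behavior; the column names are assumptions, and on Postgres the equivalent would be role grants rather than triggers.

```python
import sqlite3

# Triggers stand in for insert-only grants, which SQLite does not have.
SCHEMA = """
CREATE TABLE audit_log (
    id INTEGER PRIMARY KEY,
    actor TEXT NOT NULL,
    action TEXT NOT NULL,
    at TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP
);
CREATE TRIGGER audit_log_no_update BEFORE UPDATE ON audit_log
BEGIN SELECT RAISE(ABORT, 'audit_log is append-only'); END;
CREATE TRIGGER audit_log_no_delete BEFORE DELETE ON audit_log
BEGIN SELECT RAISE(ABORT, 'audit_log is append-only'); END;
"""


def open_audit_db() -> sqlite3.Connection:
    """Create an in-memory database with the append-only audit_log table."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    return conn


def record(conn: sqlite3.Connection, actor: str, action: str) -> None:
    """Append one audit event; this is the only write the table accepts."""
    conn.execute("INSERT INTO audit_log (actor, action) VALUES (?, ?)", (actor, action))
```

Any UPDATE or DELETE against `audit_log` raises a `sqlite3.DatabaseError`, while inserts continue to succeed.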

User-facing Surface

A searchable, filterable log in your admin panel showing recent events.
A paginated API or scheduled export letting customers ingest audit data themselves.

Tradeoffs

Cost: Data access logging enabled

Write amplification: every authenticated read produces a log write

Complexity: Immutable store chosen

Two storage systems to operate and keep in sync; queries may need to federate

Complexity: Append-only table only

Tamper-evidence relies on DB role permissions; insufficient for some compliance regimes

Background Jobs & Queues (Required, medium)

Queue Backend*

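Jobs are rows in a Postgres/SQLite table; on Postgres, workers claim rows with SELECT ... FOR UPDATE SKIP LOCKED (SQLite has no row-level locking, so it needs a different claim step).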
Durable queue on Redis with mature worker libraries.
Purpose-built broker with durability, backpressure, and multi-consumer support.

Required Capabilities*

Recurring jobs (nightly emails, weekly reports) defined in code or a UI.
Enqueue a job to run at a specific future timestamp (trial expiry, reminder emails).
Separate queues for high-priority (user-triggered exports) vs bulk (nightly batches).
Throttle jobs that call rate-limited external APIs.

Failure & Durability*

Failed jobs retry with increasing delay; after N attempts move to a dead-letter queue for human review.
Jobs are enqueued inside DB transactions; handlers are idempotent so safe retries do not duplicate effects.
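
The retry option above (increasing delay, then a dead-letter queue after N attempts) can be sketched as follows. The constants and the `run_once` helper are assumptions chosen for illustration; a real worker library would supply its own equivalents.

```python
from dataclasses import dataclass
from typing import Callable, Optional

MAX_ATTEMPTS = 5     # assumption: the "N attempts" from the text
BASE_DELAY_S = 2.0   # assumption: delay before the first retry
MAX_DELAY_S = 300.0  # assumption: upper bound on any delay


@dataclass
class Job:
    id: str
    attempts: int = 0


def retry_delay(attempt: int) -> float:
    """Delay before retry number `attempt` (1-based): 2s, 4s, 8s, ... capped."""
    return min(BASE_DELAY_S * 2 ** (attempt - 1), MAX_DELAY_S)


def run_once(job: Job, handler: Callable[[Job], None],
             dead_letter: list[Job]) -> Optional[float]:
    """Run the handler once.

    Returns the delay before the next retry, or None when the job is
    finished (it succeeded, or it was moved to the dead-letter list).
    """
    try:
        handler(job)
        return None
    except Exception:
        job.attempts += 1
        if job.attempts >= MAX_ATTEMPTS:
            dead_letter.append(job)  # parked for human review
            return None
        return retry_delay(job.attempts)
```

Retries only stay safe if the handler is idempotent, which is why the second option pairs them with idempotent handlers.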

Tradeoffs

Scalability: Database-backed queue at high volume

Primary DB absorbs queue write load; row-level locks contend with application queries

Complexity: Redis-backed queue without outbox

Enqueue happens outside the DB transaction, so jobs can run for state that was rolled back

Complexity: Transactional outbox chosen

Additional table, polling worker, and idempotency discipline; the payoff is no jobs for rolled-back state and no duplicated side effects

Rate Limiting & Abuse Prevention (medium)

Rate Limit Algorithm*

Count requests per user per minute; reset at the minute boundary.
Counts over a rolling window to eliminate boundary bursts.
Each identity has a bucket that refills at a steady rate; each request consumes a token.
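
The token-bucket option above can be sketched in a few lines. This version keeps state in memory and takes the current time as an argument so it is easy to test; in production the per-identity state would live in a shared store such as Redis, as the tradeoffs note.

```python
from dataclasses import dataclass


@dataclass
class TokenBucket:
    capacity: float        # maximum burst size
    refill_per_sec: float  # steady refill rate
    tokens: float          # tokens currently available
    updated_at: float      # time of the last refill, in seconds

    @classmethod
    def full(cls, capacity: float, refill_per_sec: float, now: float) -> "TokenBucket":
        """Start with a full bucket."""
        return cls(capacity, refill_per_sec, capacity, now)

    def allow(self, now: float, cost: float = 1.0) -> bool:
        """Refill for the elapsed time, then try to spend `cost` tokens."""
        elapsed = max(0.0, now - self.updated_at)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_per_sec)
        self.updated_at = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False
```

With `capacity=3` and `refill_per_sec=1`, a client can burst three requests at once and then sustain one request per second.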

What to Limit By*

Limits keyed on client IP.
Limits keyed on user ID or API key.
Expensive endpoints (search, export) get lower limits than cheap ones.
Aggregate limit across all users in a workspace.

Abuse Prevention Layer

hCaptcha / Turnstile on signup, login, and password reset — triggered on threshold breach, not every request.
ML-driven client fingerprinting to score requests as human/bot before routing.
Cloudflare, AWS WAF, or Fastly with OWASP and bot rule packs in front of your origin.

Response Behavior*

Return HTTP 429 with Retry-After, X-RateLimit-Limit, and X-RateLimit-Remaining so clients back off correctly.
After threshold, delay responses by seconds instead of returning 429.
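
For the HTTP 429 option above, a framework-agnostic sketch of the response headers might look like this. The function name and the `allowed` flag are hypothetical; the header names are the ones listed in the option.

```python
import math


def rate_limit_response(allowed: bool, limit: int, remaining: int,
                        retry_after_s: float) -> tuple[int, dict[str, str]]:
    """Return (status, headers) describing the caller's rate-limit budget."""
    headers = {
        "X-RateLimit-Limit": str(limit),
        "X-RateLimit-Remaining": str(max(remaining, 0)),
    }
    if allowed:
        return 200, headers
    # Retry-After is given in whole seconds; round up so clients never retry early.
    headers["Retry-After"] = str(max(1, math.ceil(retry_after_s)))
    return 429, headers
```

Sending the limit headers on successful responses as well lets well-behaved clients slow down before they hit the limit.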

Tradeoffs

UX: Per-IP limits only

False positives behind corporate NATs; attackers bypass with rotating proxies

Scalability: Per-tenant limits enabled

Noisy-neighbor protection: one tenant cannot starve others

Cost: Token bucket chosen

Allows bursts but requires a per-identity bucket state in Redis; higher memory footprint

Latency: WAF added in front

Meaningful latency cost at the edge if the WAF is geographically distant from users

Onboarding & Activation (medium)

Onboarding Format*

No dedicated onboarding UI — every empty screen contains a clear primary action pointing to the next step.
A dismissible checklist ("Invite teammate • Create project • Connect integration") visible until complete.
Step-through overlays point at UI elements on first use.
User cannot access the product until they complete N configuration screens.

Personalization Signals

Ask one or two questions to route the user to a tailored first experience.
Present starter templates ('Blank', 'Team docs', 'Marketing site') as the first interaction.
Every new workspace starts with an example project the user can play with.

Activation Support

Email nudges when a user signed up but has not yet hit the key activation action (e.g. created their first project).
A persistent help button that opens relevant docs or a short walkthrough based on the current page.
Intercom-style chat surface active for new users in their first few days.

Tradeoffs

UX: Setup wizard as format

Controls first-experience but introduces sign-up drop-off proportional to wizard length

Complexity: Activation emails enabled

Requires event tracking + scheduled jobs + segmentation infrastructure

Cost: Live chat during onboarding

Staffing cost scales with signup volume; not viable for self-serve products below a certain ACV

Summary

User System & Auth
Payments & Billing
Notifications
Roles & Permissions
Multi-tenancy & Workspaces
Transactional Email
Audit Logging
Background Jobs & Queues
Rate Limiting & Abuse Prevention
Onboarding & Activation

10 of 10 features enabled

Effort Estimate

10+ weeks

5+ engineers

10 enabled features

Key Decisions

User System & Auth

Will this product be sold to businesses (B2B)?

If yes

Add SAML/SSO and RBAC. Enterprise procurement often requires both.

If no

Email + password plus one OAuth option covers 95% of consumer use cases.


User System & Auth

Is this a security-sensitive application?

If yes

Enable TOTP MFA. Consider making it mandatory for privileged users.

If no

MFA is optional — offer it but do not require it to reduce friction.


User System & Auth

Email+password, passwordless, or SSO-only?

If passwordless

Passwordless (magic links or passkeys) eliminates password reset tickets and credential stuffing risk.

If email+password

Keep email+password as a universal fallback — OAuth outages should not lock users out.


User System & Auth

Do you need social providers (Google, GitHub, Apple)?

If yes

Add Google for B2C breadth; add GitHub for developer tools; add Apple if you ship an iOS app (App Store review expects a privacy-preserving login option such as Sign in with Apple when you offer other social logins).

If no

Skip social OAuth and avoid the app registration / key rotation overhead.


User System & Auth

Do you need SCIM provisioning?

If yes

Add SCIM alongside SAML — enterprise IT uses it to auto-provision/deprovision employees and map group membership to roles.

If no

Manual invite flows are fine until your first enterprise customer asks for SCIM in a security review.


User System & Auth

Should MFA be required, optional, or risk-based?

If risk-based

Risk-based (step up MFA on new device, new IP, or sensitive actions) gives security without friction on every login.

If optional

Offer MFA as optional first; require it only for admins or on privileged actions.


User System & Auth

Which MFA factors will you support (TOTP, SMS, WebAuthn/passkeys, hardware keys)?

If yes

Prefer WebAuthn/passkeys and TOTP. Avoid SMS as a primary factor — SIM swapping is a real threat.

If no

TOTP alone (Google Authenticator, Authy) covers the vast majority of users with minimal implementation cost.


User System & Auth

Do you need device fingerprinting or trusted-device flows?

If yes

Remember trusted devices for 30 days to skip MFA; challenge on new device or changed fingerprint.

If no

Re-prompt MFA on every login — simpler and safer for low-volume or highly sensitive apps.


User System & Auth

Offer passkey-only sign-in?

If yes

Passkeys eliminate passwords entirely — use WebAuthn with platform authenticators. Still keep an email recovery path for lost devices.

If no

Offer passkeys as an optional second factor; users without compatible devices keep using passwords.


User System & Auth

Support staff impersonation of user accounts?

If yes

Add an impersonation flow that logs both the staff identity and the target user, with a visible banner in the impersonated session.

If no

Skip impersonation — instead build admin-side read views and support tooling that do not require acting as the user.


User System & Auth

Captcha or bot detection on signup?

If yes

Add hCaptcha or Cloudflare Turnstile on signup and password reset — invisible challenges avoid user friction.

If no

Skip captcha for internal tools or invite-only products where bot signups are not a realistic threat.


User System & Auth

Use lockout or rate-limit throttling for credential stuffing?

If yes

Prefer exponentially increasing rate limits per IP and per account over hard lockouts; lockouts create support tickets and let an attacker deny service by deliberately locking a target out.

If no

If account takeover risk is low, a simple fixed rate limit (e.g., 10 attempts per 15 min) is sufficient.


User System & Auth

Allow multiple concurrent sessions per user?

If yes

Show active sessions in account settings with a revoke button — expected behavior for any multi-device product.

If no

Single-session apps (banking, compliance) should terminate old sessions on new login.


Payments & Billing

Is the revenue model recurring (SaaS)?

If yes

Choose subscription billing. Evaluate usage-based if pricing scales with consumption.

If no

One-time purchase is far simpler. Consider Stripe Checkout for a no-code option.


Payments & Billing

Is global VAT/sales tax compliance a concern?

If yes

Use Paddle as merchant of record — they handle tax across jurisdictions.

If no

Stripe gives more control; integrate TaxJar or Stripe Tax if needed later.


Payments & Billing

Do you want to offer a free trial without requiring a card upfront?

If yes

Use a no-card trial, or a reverse trial that starts users on the paid tier and drops them to the free tier unless they add a card. Expect higher signup conversion but lower trial-to-paid conversion.

If no

Card-required trials filter out tire-kickers and produce 2–3x higher trial-to-paid rates. Stripe supports both via checkout.


Payments & Billing

Do you sell to enterprise customers with procurement processes (POs, net-30 terms)?

If yes

Support invoicing workflows (Stripe Invoicing or manual PDF invoices via finance). Self-serve credit-card checkout is insufficient at that ACV.

If no

Credit-card-only is simpler and covers all SMB/prosumer use cases.


Payments & Billing

Do you sell in markets where customers transact in non-USD currencies?

If yes

Enable multi-currency pricing in Stripe or Paddle. Price in local currency — EU/UK customers strongly prefer EUR/GBP over USD conversions.

If no

USD-only is simpler; add currencies only when a market demands it.


Payments & Billing

Will you close large B2B deals that need ACH or wire transfer (>$5k)?

If yes

Enable ACH or bank-transfer payments in Stripe, or put wire instructions on invoices. Credit-card fees on large invoices are prohibitive.

If no

Card-only is fine for SMB and prosumer ticket sizes.


Payments & Billing

Do you have EU customers (SCA/3DS compliance required)?

If yes

Use Stripe Payment Intents (handles 3DS authentication automatically) or Paddle. Do not use raw Charges API — it predates SCA.

If no

Still use the modern Payment Intents API; it is the integration Stripe recommends, and it is already SCA-ready if you later sell into the EU.


Payments & Billing

Is self-serve cancellation acceptable, or do you need "contact us to cancel"?

If self-serve

Offer self-serve cancellation via Stripe Customer Portal. Online cancellation is legally required for many consumer subscriptions in California (Automatic Renewal Law), and the FTC has pushed for a federal "click-to-cancel" rule.

If contact-us

Contact-us friction increases short-term retention but damages NPS and is increasingly regulated. Think twice.


Payments & Billing

Do marketing or sales teams need to issue coupons and discounts?

If yes

Use Stripe Coupons / Paddle Discounts. Support percentage and fixed-amount discounts with expiry and redemption limits.

If no

Skip — discount UX adds complexity and is rarely needed outside marketing-led motions.


Payments & Billing

Do users frequently upgrade/downgrade mid-cycle?

If yes

Enable proration in Stripe (proration_behavior: create_prorations). Immediate upgrade + end-of-period downgrade is the customer-friendly pattern.

If no

Wait-until-renewal plan changes are simpler; skip proration logic.
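
To make the proration in the "If yes" branch concrete, here is a day-based arithmetic sketch. It is an illustration of the idea only and is not claimed to match Stripe's own calculation; the prices and the helper name are hypothetical.

```python
from decimal import Decimal, ROUND_HALF_UP


def prorated_upgrade_charge(old_price: Decimal, new_price: Decimal,
                            days_in_period: int, days_remaining: int) -> Decimal:
    """Credit the unused share of the old plan and charge the same share of the new one."""
    if not 0 <= days_remaining <= days_in_period:
        raise ValueError("days_remaining must be within the billing period")
    fraction = Decimal(days_remaining) / Decimal(days_in_period)
    credit = old_price * fraction   # unused time on the old plan
    charge = new_price * fraction   # remaining time on the new plan
    return (charge - credit).quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)
```

For example, upgrading from a 20.00 plan to a 50.00 plan with 12 of 30 days left credits 8.00, charges 20.00, and bills the difference of 12.00 immediately.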


Payments & Billing

Do you have >1000 paying customers or expect significant failed-payment volume?

If yes

Enable Stripe Smart Retries (free) plus a custom dunning email sequence (day 0, 3, 7, 14). Recovers 30–50% of failed payments.

If no

Default Stripe retries are enough; add custom dunning once failed payments become a meaningful revenue leak.


Payments & Billing

Should customers manage their own billing (payment methods, invoices, plan changes)?

If yes

Use Stripe Customer Portal or Paddle's customer portal: pre-built UI that handles payment methods, invoices, and cancellation. Massive support-ticket reducer.

If no

Build a minimal billing page and route the rest to support — only viable at low customer counts.


Payments & Billing

Do you connect buyers and sellers and need to split payments (marketplace)?

If yes

Use Stripe Connect (Standard or Express accounts). Do not build split payments yourself — tax forms, KYC, and payouts are legal minefields.

If no

Standard direct charges are simpler and correct for first-party sales.


Payments & Billing

Do you need to issue refunds regularly with reason tracking and approval flows?

If yes

Build an internal refund tool that captures reason, links to audit log, and uses Stripe Refunds API. Required for support scale and compliance.

If no

Manual refunds through the Stripe dashboard are fine until volume demands tooling.


Payments & Billing

Are you tempted to store raw card numbers to avoid re-entry (full PCI DSS scope)?

If yes

Do not. Use Stripe Payment Methods or Paddle saved cards; tokenized references keep raw card data off your systems and your PCI DSS scope minimal. The compliance overhead of storing card numbers yourself is massive.

If no

Good — always tokenize. Stripe Elements, Paddle Checkout, or hosted checkout keeps card data off your servers entirely.


Notifications

Do users need to know about events immediately?

If yes

Use real-time delivery. Budget for WebSocket/SSE infrastructure.

If no

Polling is simpler and cheaper — pick an interval that matches your SLA.


Notifications

Is this a mobile-first product?

If yes

Add mobile push (APNs/FCM). Plan for certificate rotation.

If no

Skip mobile push; web notifications or email cover most cases.


Notifications

Do users often go offline and still need to receive notifications?

If yes

Add email fallback driven by a "last seen" timestamp. Send email only when the user has been offline >5 minutes to avoid spam.

If no

In-app real-time or polling is enough — email adds unsubscribe risk without upside.
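
The "last seen" rule in the "If yes" branch reduces to a single comparison. A minimal sketch, assuming the application already records a `last_seen` timestamp per user:

```python
from datetime import datetime, timedelta
from typing import Optional

OFFLINE_THRESHOLD = timedelta(minutes=5)  # the ">5 minutes" from the text


def should_email(last_seen: Optional[datetime], now: datetime) -> bool:
    """Email only when the user has been away longer than the threshold."""
    if last_seen is None:  # assumption: never seen in-app counts as offline
        return True
    return now - last_seen > OFFLINE_THRESHOLD
```

Both timestamps should be timezone-aware (or both naive) so the subtraction is valid.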


Notifications

Will you send multiple notification types with different urgency levels?

If yes

Adopt per-type granularity so users can mute digests without muting security alerts. Define the taxonomy up front.

If no

A single all-or-nothing toggle is enough and avoids preference-UI bloat.


Notifications

Is notification volume high enough that users will complain about floods?

If yes

Batch similar notifications in a rolling window (e.g., "Alice and 12 others liked your post"). Entity + time-window grouping is the standard pattern.

If no

Deliver individually — grouping adds UI complexity that is not worth it at low volume.
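
The entity + time-window grouping in the "If yes" branch can be sketched as below. For simplicity this uses fixed (non-overlapping) windows rather than a true rolling window, and the `Event` shape is an assumption.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    actor: str
    verb: str     # e.g. "liked"
    entity: str   # e.g. "your post"
    at: float     # event time in seconds


def group_events(events: list[Event], window_s: float) -> list[str]:
    """Group by (verb, entity) within fixed time windows and summarize each group."""
    buckets: dict[tuple[str, str, int], list[str]] = defaultdict(list)
    for e in sorted(events, key=lambda e: e.at):
        key = (e.verb, e.entity, int(e.at // window_s))
        if e.actor not in buckets[key]:  # count each actor once per group
            buckets[key].append(e.actor)
    messages = []
    for (verb, entity, _), actors in buckets.items():
        others = len(actors) - 1
        if others == 0:
            who = actors[0]
        elif others == 1:
            who = f"{actors[0]} and 1 other"
        else:
            who = f"{actors[0]} and {others} others"
        messages.append(f"{who} {verb} {entity}")
    return messages
```

The earliest actor in each window is named and the rest are counted, which yields summaries in the "Alice and 12 others liked your post" style.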


Notifications

Are any notifications security-critical or financially sensitive?

If yes

Require at-least-once delivery with persistent storage and retry. Pair with email fallback for the highest-urgency classes.

If no

Best-effort (fire-and-forget) is fine for social and informational notifications.


Notifications

Do users have reasonable expectations of quiet hours / do-not-disturb?

If yes

Add a per-user quiet-hours window and a priority flag that lets critical alerts bypass it. Requires per-type granularity to know what is critical.

If no

Skip quiet hours — the added preference surface is not worth it for low-volume products.
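
A quiet-hours check with a priority bypass, as described in the "If yes" branch, might look like this. The `"urgent"` priority value and function names are assumptions; times are the user's local time.

```python
from datetime import time


def in_quiet_hours(local_time: time, start: time, end: time) -> bool:
    """True if local_time falls in [start, end), including windows that cross midnight."""
    if start == end:
        return False  # assumption: an empty window means quiet hours are off
    if start < end:
        return start <= local_time < end
    # Window wraps past midnight, e.g. 22:00 to 07:00.
    return local_time >= start or local_time < end


def should_deliver_now(local_time: time, start: time, end: time, priority: str) -> bool:
    """Urgent notifications bypass quiet hours; everything else waits."""
    if priority == "urgent":
        return True
    return not in_quiet_hours(local_time, start, end)
```

The wrap-around branch matters because most quiet-hours windows span midnight.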


Notifications

Do users need to mark notifications as read / track what they have seen?

If yes

Persist notifications with a read state. Consider cross-device read-state sync via a server-side timestamp or event log.

If no

Ephemeral toast-style notifications are simpler and appropriate for low-stakes alerts.


Notifications

Do you need delivery analytics (delivered, opened, clicked)?

If yes

Use a provider like Knock or Novu that tracks delivery funnel by channel. Essential if notifications drive revenue or retention KPIs.

If no

Skip the instrumentation — simple success/failure logs are enough.


Notifications

Do users have multiple devices (web + mobile + desktop)?

If yes

Sync read-state across devices via a server-side last-read timestamp. Without this, users see the same notification over and over.

If no

Device-local read state is simpler and avoids backend round-trips.


Notifications

Do you need SMS as a notification channel?

If yes

Use Twilio or MessageBird. Reserve SMS for high-priority alerts only; it is expensive per message and users churn fast on SMS spam.

If no

Skip SMS — push and email cover 99% of use cases at a fraction of the cost.


Notifications

Are some notifications purely in-app / ephemeral (toast-style)?

If yes

Deliver via SSE/WebSocket without persisting — no storage, no retry, no read-state UI. Good for "Saved", "Uploaded" feedback.

If no

All notifications go through the durable pipeline with read-state tracking.


Notifications

Do you need localized notification content (multi-language)?

If yes

Store notification templates with i18n keys and render per-recipient locale at send time. Avoid pre-rendered strings in the event.

If no

Hardcoded English strings are fine until you have non-English users.


Notifications

Do you have more than 5 notification types or expect to add more regularly?

If yes

Invest in a template engine (Knock, MJML, or Handlebars) with versioned templates rather than hardcoded message strings.

If no

Hardcoded message strings in the sender are simpler and fine for a small stable set.


Notifications

Do product/marketing teams need to preview and test-send notifications before release?

If yes

Build an internal preview tool with a test-recipient flag. Prevents production embarrassments.

If no

Skip — engineers can validate in staging until non-engineers start authoring templates.


Notifications

Are there urgent notifications that must bypass quiet hours (security alerts, outages)?

If yes

Add a priority flag on notifications and skip DND for priority=urgent. Document this behavior so users expect it.

If no

Quiet hours apply uniformly — simpler and avoids abuse of the override.


Roles & Permissions

Do users need different levels of access in different parts of the product?

If yes

Move beyond simple roles — at minimum, adopt RBAC.

If no

A two-value role enum on the user table is plenty.


Roles & Permissions

Do users share individual items (documents, projects) with specific other users?

If yes

You need per-resource scope. Plan for ABAC or ReBAC now, not later.

If no

Workspace-level scope is usually sufficient.


Roles & Permissions

Are enterprise customers asking to configure roles themselves?

If yes

Expose a role editor built on your existing permission primitives, and charge for it.

If no

Ship a fixed role set and iterate based on feedback.


Roles & Permissions

Do you need per-field permission granularity?

If yes

Move to ABAC or a policy engine (OpenFGA, Cerbos) — field-level rules are unmaintainable in RBAC.

If no

Row/resource-level checks are enough; keep the model coarse.


Roles & Permissions

Do permissions need to inherit via hierarchical groups (folder → subfolder, org → team)?

If yes

ReBAC is the natural fit — Zanzibar-style graph traversal handles inheritance cleanly.

If no

Flat role-to-resource assignments are simpler and easier to debug.


Roles & Permissions

Is deny-by-default the required posture?

If yes

Default every permission check to false; require an explicit grant. Standard for compliance-regulated products.

If no

Allow-by-default with blocklist rules is risky — only acceptable for internal tools.


Roles & Permissions

Must users delegate access (X grants Y access to Z) without an admin?

If yes

Per-resource scope is required. Build a share action with grantor tracking in the audit log.

If no

Admin-mediated grants keep the authorization surface auditable and small.


Roles & Permissions

Do you need policy-as-code (OPA, Cedar) managed alongside application code?

If yes

Adopt a policy engine — policies get versioned, reviewed, and tested like any source file.

If no

Keep authorization as a central module in app code until policy churn justifies the engine.


Roles & Permissions

Do grants need to expire automatically (temporary access, contractor windows)?

If yes

Add an expires_at on every grant and a scheduled job that revokes on expiry. Critical for least-privilege compliance.

If no

Permanent grants with manual revocation are simpler but audit-unfriendly.
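
The expiring grants in the "If yes" branch can be sketched with an `expires_at` field plus a sweep function that a scheduled job would call. The `Grant` shape and function names are hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass(frozen=True)
class Grant:
    user_id: str
    permission: str
    expires_at: Optional[datetime] = None  # None means the grant does not expire


def is_active(grant: Grant, now: datetime) -> bool:
    """A grant is active until its expiry time is reached."""
    return grant.expires_at is None or now < grant.expires_at


def sweep_expired(grants: list[Grant], now: datetime) -> tuple[list[Grant], list[Grant]]:
    """Split grants into (kept, revoked); a scheduled job would persist the result."""
    kept = [g for g in grants if is_active(g, now)]
    revoked = [g for g in grants if not is_active(g, now)]
    return kept, revoked
```

Calling `is_active` at check time as well means access ends on schedule even if the sweep job runs late.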


Roles & Permissions

Do you need a break-glass / super-admin role for incident response?

If yes

Define it explicitly, require MFA to assume it, and audit-log every action taken under it. Keep the member list tiny.

If no

Regular admin + vendor support access covers most cases; avoid god-mode accounts.


Roles & Permissions

Are sensitive role grants (e.g. billing-admin) subject to approval workflow?

If yes

Build a request + approve flow with a second approver on the granting side. Common in SOC 2 environments.

If no

Direct admin grants are faster — add audit logging instead.


Roles & Permissions

Do enterprise customers need SCIM-driven group membership from their IdP?

If yes

Map SCIM groups to roles; treat the IdP as source of truth and avoid manual role edits for SCIM-managed users.

If no

In-app role management is simpler for SMB customers.


Roles & Permissions

Do permission changes need a dedicated audit log (separate from general audit log)?

If yes

Emit a specialized authz-change stream — compliance reviewers need to query grants without sifting through all activity.

If no

Fold permission changes into the general audit log; tag them for easy filtering.


Roles & Permissions

Do API keys need permissions separate from the user who minted them?

If yes

Give keys their own scoped permission set (typically a subset of the user's). Prevents accidental privilege inheritance.

If no

Mirror the minting user's permissions — simpler but revoking a user breaks their keys.


Roles & Permissions

Can you push authorization into the database with row-level security?

If yes

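Postgres RLS (including via Supabase) centralizes enforcement at the data layer, so ad hoc queries from the application role cannot bypass it. Superusers, roles with BYPASSRLS, and (by default) table owners are exempt. Set the auth context on every connection or transaction.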

If no

Central policy module in app code is easier to debug and port across databases.


Roles & Permissions

Are permission checks on your hot path (every list render)?

If yes

Cache lookups with a short TTL (30–60s) plus a revocation list checked per request — pure DB lookups will bottleneck.

If no

Uncached checks are fine; add caching only when profiler data demands it.
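
The short-TTL cache plus per-request revocation check from the "If yes" branch could be sketched like this. The `lookup` callable stands in for the real (slow) permission query and is an assumption, as is the in-memory storage.

```python
from typing import Callable


class PermissionCache:
    """Caches permission lookups for a short TTL; revocations are consulted on every check."""

    def __init__(self, lookup: Callable[[str, str], bool], ttl_s: float = 30.0) -> None:
        self._lookup = lookup  # hypothetical slow path, e.g. a DB query
        self._ttl_s = ttl_s
        self._entries: dict[tuple[str, str], tuple[bool, float]] = {}
        self._revoked: set[tuple[str, str]] = set()

    def revoke(self, user_id: str, permission: str) -> None:
        """Record a revocation and drop any cached answer for it."""
        self._revoked.add((user_id, permission))
        self._entries.pop((user_id, permission), None)

    def check(self, user_id: str, permission: str, now: float) -> bool:
        key = (user_id, permission)
        if key in self._revoked:  # checked per request, before the cache
            return False
        cached = self._entries.get(key)
        if cached is not None and now - cached[1] < self._ttl_s:
            return cached[0]
        allowed = self._lookup(user_id, permission)
        self._entries[key] = (allowed, now)
        return allowed
```

The revocation set lets a removed permission take effect immediately instead of after the TTL; in a multi-process deployment it would need to live in a shared store.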


Multi-tenancy & Workspaces

Do your target customers contractually require data residency or physical data isolation?

If yes

Plan database-per-tenant (or at minimum region-pinned schemas) from the start.

If no

Shared DB + tenant_id is the right starting point — do not over-isolate prematurely.


Multi-tenancy & Workspaces

Do you expect one tenant to be >20% of total load?

If yes

Consider giving large tenants their own database to prevent noisy-neighbor impact.

If no

Shared-DB + resource limits per tenant is sufficient.


Multi-tenancy & Workspaces

Shared DB with tenant_id, schema-per-tenant, or DB-per-tenant?

If yes

Default to shared DB + tenant_id for 95% of SaaS. Move to schema-per-tenant only as a transitional step, and DB-per-tenant only when contracts demand physical isolation.

If no

If unsure, stay with shared DB + tenant_id and invest in defense-in-depth (RLS, typed repos) instead of heavier isolation.


Multi-tenancy & Workspaces

Do you need per-tenant custom domains (vanity domains like customer.com)?

If yes

Use subdomain routing with an ACME automation layer (e.g. Caddy, cert-manager, or a managed service like Approximated) to issue TLS certs on demand.

If no

Wildcard subdomain TLS is enough — skip the cert automation complexity until a customer actually asks.


Multi-tenancy & Workspaces

Do customers contractually require tenant-level encryption key separation (BYOK)?

If yes

Plan DB-per-tenant or envelope encryption with per-tenant KMS keys (AWS KMS, GCP KMS). Budget significant engineering for key rotation and recovery.

If no

A single application-managed encryption key is sufficient and dramatically simpler to operate.


Multi-tenancy & Workspaces

Do tenants need branded UI (logo, colors, custom domain)?

If yes

Enable custom-branding under features-per-tenant and store theme config per tenant. Standard for B2B2C.

If no

Skip branding entirely — one less surface to test, one less config to migrate.


Multi-tenancy & Workspaces

Do enterprise contracts require isolated backups per tenant?

If yes

Plan DB-per-tenant or schema-per-tenant so pg_dump produces a clean per-tenant artifact. Shared-schema backups cannot satisfy this.

If no

One backup pipeline across all tenants is sufficient and cheaper to operate.


Multi-tenancy & Workspaces

Do internal admins need cross-tenant reporting (aggregate metrics across tenants)?

If yes

Shared DB makes this trivial. With DB-per-tenant, plan a separate analytics warehouse (Snowflake, BigQuery) that ingests from all tenant DBs.

If no

No extra infrastructure needed — per-tenant queries are the only access pattern.

Apply:

Multi-tenancy & Workspaces

Do tenants need self-serve data export (GDPR portability)?

If yes

Build a per-tenant export job that streams their data as JSON/CSV. Add tenant-settings to expose the trigger. Required for GDPR compliance.

If no

Skip for now — handle ad-hoc export requests manually until a customer asks.

Apply:

Multi-tenancy & Workspaces

Do your customers need to share resources across tenants (e.g. a partner sharing a project)?

If yes

Requires explicit cross-tenant ACL modeling; in a shared schema, a tenant_id filter alone can no longer decide access. Consider ReBAC (SpiceDB, OpenFGA).

If no

Strict tenant_id isolation is the right default — never cross the boundary.

Apply:

Multi-tenancy & Workspaces

Do enterprise customers ask for tenant-scoped feature flags?

If yes

Enable feature-flags under features-per-tenant. Use LaunchDarkly, Unleash, or a simple table keyed on tenant_id. Expect to need this early, typically within the first six months.

If no

Global flags are fine initially; add per-tenant scoping when the first enterprise deal demands it.

Apply:

Multi-tenancy & Workspaces

Do you need per-tenant rate limits (prevent noisy neighbors at the API layer)?

If yes

Key rate-limit buckets on tenant_id in Redis. Expose current usage in tenant-settings so admins can see their budget.

If no

Per-user or global limits are sufficient at your current scale.

Apply:

Multi-tenancy & Workspaces

Is tenant signup self-serve (rather than sales-gated)?

If yes

Build a provisioning flow that creates tenant context, default admin, and any per-tenant resources (schemas, S3 prefixes) atomically.

If no

Manual provisioning via internal admin is fine for enterprise-only products — skip the self-serve UX.

Apply:

Multi-tenancy & Workspaces

Do you need a tenant suspension/deactivation workflow (non-payment, abuse)?

If yes

Add a tenant status field (active/suspended/deleted) that middleware checks on every request; retain data for a grace period before hard-deleting.

If no

Hard-delete on offboarding is simpler but loses any recovery window — revisit once you have paying customers.
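As a rough illustration of the status check and grace period described in this card, here is a minimal sketch. The names `TenantStatus`, `Tenant`, `check_tenant_access`, and `eligible_for_hard_delete`, the HTTP status codes, and the 30-day grace period are all assumptions made for the example, not part of the spec.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from enum import Enum
from typing import Optional


class TenantStatus(Enum):
    ACTIVE = "active"
    SUSPENDED = "suspended"
    DELETED = "deleted"


# Hypothetical grace period; pick a value that matches your contracts.
GRACE_PERIOD = timedelta(days=30)


@dataclass
class Tenant:
    id: str
    status: TenantStatus
    suspended_at: Optional[datetime] = None


def check_tenant_access(tenant: Tenant) -> tuple[int, str]:
    """Return an (HTTP status, reason) pair; intended to be called from request middleware."""
    if tenant.status is TenantStatus.ACTIVE:
        return 200, "ok"
    if tenant.status is TenantStatus.SUSPENDED:
        return 403, "tenant suspended"
    return 404, "tenant not found"


def eligible_for_hard_delete(tenant: Tenant, now: datetime) -> bool:
    """Data is purged only after the grace period following suspension has elapsed."""
    if tenant.status is TenantStatus.ACTIVE or tenant.suspended_at is None:
        return False
    return now - tenant.suspended_at >= GRACE_PERIOD
```

Keeping the purge decision in a separate function means a scheduled cleanup job can reuse it without going through the request path.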

Apply:

Multi-tenancy & Workspaces

Will you need to migrate tenants between clusters/regions (data residency, scale)?

If yes

Design tenant data as a portable unit from day one — schema-per-tenant or DB-per-tenant makes this a dump+restore rather than a surgical export.

If no

Shared-schema is fine; moving a tenant between clusters later becomes painful but not impossible.

Apply:

Multi-tenancy & Workspaces

Do you need per-tenant observability (metrics, logs tagged by tenant)?

If yes

Tag every log line, metric, and trace with tenant_id. Use Grafana/Datadog tenant dashboards so support can diagnose individual tenants.

If no

Aggregate observability is enough for small tenant counts; add tagging once you have >50 tenants.
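One way to tag every log line with tenant_id is a context variable plus a logging filter. The sketch below uses only the Python standard library; the `current_tenant` variable, the `TenantFilter` class, the logger name, and the log format are illustrative assumptions.

```python
import contextvars
import io
import logging

# Holds the tenant for the current request or job; "-" means no tenant context.
current_tenant = contextvars.ContextVar("current_tenant", default="-")


class TenantFilter(logging.Filter):
    """Attach tenant_id to every record so formatters and log shippers can index on it."""

    def filter(self, record: logging.LogRecord) -> bool:
        record.tenant_id = current_tenant.get()
        return True


def build_logger(stream) -> logging.Logger:
    logger = logging.getLogger("app.tenant_demo")
    logger.setLevel(logging.INFO)
    logger.propagate = False
    logger.handlers.clear()
    handler = logging.StreamHandler(stream)
    handler.setFormatter(logging.Formatter("tenant_id=%(tenant_id)s %(levelname)s %(message)s"))
    handler.addFilter(TenantFilter())
    logger.addHandler(handler)
    return logger


buffer = io.StringIO()
log = build_logger(buffer)
token = current_tenant.set("acme")  # would normally happen in request middleware
log.info("invoice generated")
current_tenant.reset(token)
```

Because the tenant lives in a context variable, application code can log as usual and the tag follows the request through async tasks as well as threads.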

Apply:

Multi-tenancy & Workspaces

Do you need per-tenant audit logs for GDPR data-subject requests?

If yes

Combine with the audit-logging module — ensure every audit entry carries tenant_id and build a per-tenant export endpoint.

If no

Central audit logs with tenant_id filtering cover most cases until a DSR lands.

Apply:

Multi-tenancy & Workspaces

Do you need per-tenant queues/workers for noisy-neighbor isolation in background jobs?

If yes

Partition your queue by tenant_id (separate BullMQ/SQS queues per tenant, or weighted fair scheduling). Required for any tenant running bulk imports.

If no

A shared queue with per-tenant concurrency caps is simpler and sufficient for balanced workloads.
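To make the fair-scheduling idea concrete, the sketch below implements plain round-robin across per-tenant FIFO queues, which is the equal-weight special case of weighted fair scheduling. It is an in-memory illustration only; the `FairQueue` class is hypothetical and is not how BullMQ or SQS work internally.

```python
from collections import deque
from typing import Optional


class FairQueue:
    """Round-robin across per-tenant FIFO queues so one tenant's backlog cannot starve others."""

    def __init__(self) -> None:
        # dict preserves insertion order; the first key is the next tenant to serve.
        self._queues: dict[str, deque] = {}

    def enqueue(self, tenant_id: str, job: str) -> None:
        self._queues.setdefault(tenant_id, deque()).append(job)

    def dequeue(self) -> Optional[tuple[str, str]]:
        if not self._queues:
            return None
        tenant_id = next(iter(self._queues))
        queue = self._queues.pop(tenant_id)
        job = queue.popleft()
        if queue:
            # Re-insert at the back so other tenants get a turn first.
            self._queues[tenant_id] = queue
        return tenant_id, job
```

With three jobs from a tenant "big" enqueued ahead of one job from "small", the small tenant's job is served second rather than fourth.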

Apply:

Transactional Email

Will you ever send marketing email (newsletters, promotions) from the same brand?

If yes

Plan separate transactional and marketing streams from the start.

If no

A single stream is simpler; split later if you add marketing email.

Apply:

Transactional Email

Do you expect to send >100k emails/month in year one?

If yes

Evaluate SES or Postmark pricing carefully; negotiate volume discounts.

If no

Pick the best developer experience (Resend / Postmark) — the price delta is rounding at low volume.

Apply:

Transactional Email

Should you use a managed provider (SendGrid, Postmark, SES) rather than self-host?

If yes

Use a managed provider: Postmark for deliverability, Resend for DX, SES for cost at volume. Self-hosting is rarely worth it for transactional email.

If no

Only self-host for regulated environments with egress constraints; expect months of reputation work.

Apply:

Transactional Email

Is a dedicated IP justified for your volume?

If yes

Above ~100k emails/month, request a dedicated IP and budget a 2–4 week warm-up. Below that, shared pools from Postmark/SendGrid are cleaner.

If no

Stay on the shared pool — reputation is managed for you.

Apply:

Transactional Email

Should mail send from an isolated subdomain (mail.yourdomain.com)?

If yes

Standard practice — protects root-domain reputation from email mistakes and makes DNS records easier to manage.

If no

Send from the root domain only if you have no other choice; keep SPF/DKIM alignment tight.

Apply:

Transactional Email

Is DMARC enforcement (p=reject or p=quarantine) required?

If yes

Start with p=none for reporting, then ramp to quarantine and reject once SPF+DKIM alignment is verified across all senders.

If no

Inbox providers increasingly require DMARC — plan to enforce within 6 months anyway.

Apply:

Transactional Email

Do user replies to transactional email need to drive app actions (reply-to-comment)?

If yes

Use a provider with inbound reply parsing (Postmark, SendGrid Inbound Parse) and a dedicated Reply-To subdomain with MX records.

If no

Set Reply-To to a monitored support inbox or no-reply address.

Apply:

Transactional Email

Does marketing/CX need to edit templates without a code deploy?

If yes

Use provider-hosted templates (SendGrid Dynamic Templates, Postmark) or a notification platform (Knock, Courier). Keep security emails in code.

If no

Code-owned templates (React Email, MJML) are reviewable and version-controlled.

Apply:

Transactional Email

Do templates need per-recipient personalization beyond name / link?

If yes

Use a templating engine with merge fields (Handlebars, Liquid). Provider templates handle this well; React Email makes it trivial in code.

If no

Static templates with a few variables are fine — don't over-engineer.

Apply:

Transactional Email

Do you need template versioning with rollback?

If yes

Code templates get this from git for free. For provider templates, pick one with built-in versioning (Postmark) or snapshot before edits.

If no

Direct edits are fine for low-stakes messages.

Apply:

Transactional Email

Do you need to send localized email content per recipient?

If yes

Either one template per locale (simple, duplicated) or a single template with i18n key lookups. Store recipient locale on the user record.

If no

English-only ships faster; add locales when revenue justifies it.

Apply:

Transactional Email

Do you need open and click tracking for product email?

If yes

All major providers offer it as a toggle. Useful for onboarding email analytics — but disclose tracking in your privacy policy.

If no

Disable trackers on security-sensitive email (password resets) regardless — tracking pixels in those emails look phishy.

Apply:

Transactional Email

Are bounce and complaint webhooks processed to suppress bad addresses?

If yes

Non-negotiable at any real volume. Subscribe to provider webhooks and maintain a suppression table checked before every send.

If no

You will tank your sender reputation within weeks — this is not optional.
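A suppression check before every send can be sketched as follows. The normalized event shape (`type`, `email`), the `SuppressionList` class, and the `send_email` helper are assumptions for illustration; real provider webhook payloads differ and would need mapping to this shape.

```python
from dataclasses import dataclass, field


@dataclass
class SuppressionList:
    """In-memory stand-in for a suppression table; a real one would live in your database."""

    _reasons: dict[str, str] = field(default_factory=dict)

    def handle_webhook(self, event: dict) -> None:
        # Hypothetical normalized event: {"type": "hard_bounce" | "complaint" | ..., "email": "..."}
        if event.get("type") in {"hard_bounce", "complaint"}:
            self._reasons[event["email"].strip().lower()] = event["type"]

    def is_suppressed(self, email: str) -> bool:
        return email.strip().lower() in self._reasons


def send_email(to: str, subject: str, suppressions: SuppressionList, outbox: list) -> bool:
    """Check the suppression list before every send; return False when the send is skipped."""
    if suppressions.is_suppressed(to):
        return False
    outbox.append((to, subject))  # placeholder for the provider API call
    return True
```

Normalizing addresses (trim and lowercase) on both write and lookup avoids the common bug where a bounced address slips through because of casing.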

Apply:

Transactional Email

Do you need to schedule sends for a future time?

If yes

Most managed providers support scheduled sends natively; otherwise enqueue to a delayed job queue (BullMQ delayed jobs, or SQS with a delay, which is capped at 15 minutes per message).

If no

Send immediately from the triggering event — simpler.

Apply:

Transactional Email

Do you send high-fan-out batches (announcement to all users at once)?

If yes

Use the provider's batch send API (SendGrid v3, Postmark batch). Chunk to stay under per-call limits and spread over minutes to avoid throttling.

If no

One-at-a-time calls via your background queue are simpler.

Apply:

Transactional Email

Can end-users (white-label customers) customize email content?

If yes

Sandbox the template language (no arbitrary code), validate on save, and preview before activating. Use a notification platform if this is core.

If no

Keep templates locked down — far fewer support tickets.

Apply:

Transactional Email

Do you need an internal preview / test-send surface for QA?

If yes

Build an admin route that lists all templates with sample data. Pair with Mailpit/Mailhog in dev to catch rendering bugs before prod.

If no

You'll hear about broken templates from customers — not recommended.

Apply:

Transactional Email

Do you need a single unsubscribe list shared across product surfaces?

If yes

Centralize in your user record or a notification platform — users unsubscribing from any email should stop all non-critical mail.

If no

Per-stream unsubscribes create support tickets; avoid if at all possible.

Apply:

Audit Logging

Are you pursuing SOC 2, ISO 27001, or a similar audit?

If yes

Plan for append-only storage with >= 1 year retention, authentication + admin + mutation scopes, and an auditor-facing export.

If no

Start with auth events in an append-only table; expand scope when a customer or incident forces it.

Apply:

Audit Logging

Do customers need to answer "who did X?" inside your product?

If yes

Build an admin audit log UI — you will be asked for one on every enterprise deal.

If no

Internal-only access is fine until you hear the first request.

Apply:

Audit Logging

Must logs be tamper-evident (hash chain / signing)?

If yes

Chain each log entry by hashing (prev_hash + payload), or sign entries with an HSM-backed key. Auditors commonly expect this level of trail integrity in SOX and HIPAA contexts.

If no

Insert-only DB permissions on an append-only table are sufficient for most internal use.
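The hash-chain approach (each entry hashing prev_hash + payload) can be sketched in a few lines. This is an assumption-laden illustration: SHA-256, canonical JSON serialization, the all-zero genesis hash, and the function names are choices made for the example.

```python
import hashlib
import json

GENESIS_HASH = "0" * 64  # assumed sentinel used as prev_hash for the first entry


def entry_hash(prev_hash: str, payload: dict) -> str:
    # Canonical JSON so the same payload always produces the same hash.
    body = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + body).encode("utf-8")).hexdigest()


def append_entry(log: list, payload: dict) -> dict:
    prev_hash = log[-1]["hash"] if log else GENESIS_HASH
    entry = {"payload": payload, "prev_hash": prev_hash, "hash": entry_hash(prev_hash, payload)}
    log.append(entry)
    return entry


def verify_chain(log: list) -> bool:
    """Recompute every hash; an edited, reordered, or mid-chain deleted entry breaks the chain."""
    prev_hash = GENESIS_HASH
    for entry in log:
        if entry["prev_hash"] != prev_hash or entry["hash"] != entry_hash(prev_hash, entry["payload"]):
            return False
        prev_hash = entry["hash"]
    return True
```

Note that a chain alone does not detect truncation of the most recent entries; periodically anchoring the latest hash somewhere outside the log store closes that gap.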

Apply:

Audit Logging

Do you need long-term retention (1 year or more, up to 7+ years)?

If yes

Long retention (1y+): tier cold logs to S3/Glacier with lifecycle rules. 7y+ is a SOX/healthcare signal — plan storage costs.

If no

30–90 days in a hot store (primary DB or ClickHouse) covers security review timelines for non-regulated apps.

Apply:

Audit Logging

Should logs include before/after diffs on updates?

If yes

Capture a JSON diff (jsondiffpatch or a custom field-level diff) — essential for customer-facing "who changed this?" questions.

If no

Log action + resource ID only; cheaper but limits forensic value.
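A custom field-level diff of the kind mentioned in this card can be quite small. The sketch below compares top-level fields only; the function name `field_diff` and the output shape are assumptions, and nested objects are treated as opaque values.

```python
from typing import Any


def field_diff(before: dict[str, Any], after: dict[str, Any]) -> dict[str, dict[str, Any]]:
    """Return {field: {"before": ..., "after": ...}} for top-level fields that changed.

    Added fields have no "before" key and removed fields have no "after" key.
    """
    diff: dict[str, dict[str, Any]] = {}
    for key in sorted(before.keys() | after.keys()):
        if key not in before:
            diff[key] = {"after": after[key]}
        elif key not in after:
            diff[key] = {"before": before[key]}
        elif before[key] != after[key]:
            diff[key] = {"before": before[key], "after": after[key]}
    return diff
```

Storing only changed fields keeps audit rows small, at the cost of needing the full history to reconstruct a record's state at a given time.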

Apply:

Audit Logging

Do you need to log reads (access logs) in addition to writes?

If yes

Sample or scope to sensitive resources only — full read logging often produces 100x the write volume. Required for HIPAA.

If no

Write-only logging is the default — covers the overwhelming majority of compliance and forensics needs.

Apply:

Audit Logging

Store audit logs separately from operational DB?

If yes

Stream to a dedicated store (ClickHouse, S3, or SIEM) — isolates audit traffic from app queries and allows differing retention/permissions.

If no

An append-only table in the primary DB is simpler and sufficient at early scale.

Apply:

Audit Logging

Cryptographic signing of entries required?

If yes

Sign each entry with an HSM-backed key (AWS KMS) — provides non-repudiation beyond hash chaining.

If no

Hash-chain or insert-only permissions are enough until an auditor asks.

Apply:

Audit Logging

Exportable as SIEM-compatible (CEF, JSON)?

If yes

Offer structured JSON export and optionally CEF/LEEF for enterprise SIEMs (Splunk, QRadar) — usually gated behind a plan.

If no

A simple CSV export covers most self-serve customers.

Apply:

Audit Logging

Real-time alerts on specific events?

If yes

Route high-signal events (privilege escalation, mass delete) through a streaming pipeline (Kinesis/Kafka) into alerting — PagerDuty or customer Slack.

If no

Batch nightly review is enough for low-stakes environments.

Apply:

Audit Logging

Distinguish system actions from user actions?

If yes

Model actor as a typed union (user | system | api_key | admin) — required for any meaningful forensic query.

If no

A single actor_id field works short-term but becomes ambiguous fast — avoid.
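The typed-union actor model (user | system | api_key | admin) might look like the sketch below. The dataclass names and the flattened column names (`actor_type`, `actor_id`, `on_behalf_of`) are hypothetical choices for the example.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class UserActor:
    user_id: str


@dataclass(frozen=True)
class SystemActor:
    job_name: str


@dataclass(frozen=True)
class ApiKeyActor:
    key_id: str
    owner_user_id: str


@dataclass(frozen=True)
class AdminActor:
    admin_id: str


Actor = Union[UserActor, SystemActor, ApiKeyActor, AdminActor]


def actor_to_record(actor: Actor) -> dict:
    """Flatten an actor into the columns an audit row might store."""
    if isinstance(actor, UserActor):
        return {"actor_type": "user", "actor_id": actor.user_id}
    if isinstance(actor, SystemActor):
        return {"actor_type": "system", "actor_id": actor.job_name}
    if isinstance(actor, ApiKeyActor):
        return {"actor_type": "api_key", "actor_id": actor.key_id, "on_behalf_of": actor.owner_user_id}
    if isinstance(actor, AdminActor):
        return {"actor_type": "admin", "actor_id": actor.admin_id}
    raise TypeError(f"unknown actor: {actor!r}")
```

With an explicit `actor_type` column, a forensic query such as "all changes made by API keys last week" becomes a simple filter rather than a guess based on ID formats.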

Apply:

Audit Logging

Log IP and user-agent on every action?

If yes

Capture IP + user-agent + geo on every event — standard for security review and fraud investigations.

If no

Auth events only is the bare minimum; expect to backfill later.

Apply:

Audit Logging

Retain deleted resource IDs in logs indefinitely?

If yes

Keep resource IDs forever — critical for "what happened to record X?" questions after deletion.

If no

Honor GDPR right-to-erasure by tombstoning PII but preserving action records with hashed IDs.

Apply:

Audit Logging

Redact/tokenize PII in log bodies?

If yes

Run an allowlist + regex redaction pass at the producer before write — tokens, emails, card numbers never land in the log store.

If no

Acceptable only if logs never leave your trust boundary — avoid.
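A producer-side redaction pass combining a key allowlist with regex scrubbing could be sketched like this. The allowlisted keys, the token format (`sk_live_...`), and the patterns are illustrative assumptions; production patterns need tuning against your own data.

```python
import re

# Hypothetical allowlist: only these keys are ever written to the log store.
ALLOWED_KEYS = {"action", "resource_id", "note"}

EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
# Rough card-number pattern: 13 to 16 digits, optionally separated by spaces or hyphens.
CARD_RE = re.compile(r"\b\d(?:[ -]?\d){12,15}\b")
# Assumed token format for illustration, e.g. "sk_live_abc123".
TOKEN_RE = re.compile(r"\bsk_(?:live|test)_[A-Za-z0-9]+\b")


def redact_text(value: str) -> str:
    value = EMAIL_RE.sub("[EMAIL]", value)
    value = CARD_RE.sub("[CARD]", value)
    return TOKEN_RE.sub("[TOKEN]", value)


def redact_event(event: dict) -> dict:
    """Drop keys outside the allowlist, then scrub the string values that remain."""
    return {
        key: redact_text(value) if isinstance(value, str) else value
        for key, value in event.items()
        if key in ALLOWED_KEYS
    }
```

The allowlist does most of the work: unknown fields are dropped by default, and the regexes act as a second line of defense for free-text values.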

Apply:

Audit Logging

Log admin impersonation with both identities?

If yes

Record both real_actor and impersonated_user on every event during an impersonation session — required for SOC 2 and customer trust.

If no

Single-actor logs make it impossible to tell who really acted — always log both.

Apply:

Audit Logging

Is write-once storage (S3 Object Lock) a compliance need?

If yes

Stream logs to S3 with Object Lock compliance mode — cheapest credible WORM store. Neither you nor an attacker can rewrite.

If no

Insert-only DB table is enough until an auditor requires immutable storage.

Apply:

Background Jobs & Queues

Does any job charge money, send external messages, or call a paid API?

If yes

Use transactional outbox and make handlers idempotent. Store an idempotency key on the job.

If no

Standard retry + DLQ is sufficient.

Apply:

Background Jobs & Queues

Do you already run Redis or a broker?

If yes

Use it for jobs — a second persistence dependency is rarely justified.

If no

Start with a database-backed queue; migrate only when volume demands it.

Apply:

Background Jobs & Queues

Do you need scheduled/cron jobs in addition to on-demand enqueues?

If yes

Enqueue from a single scheduler process (not per-worker cron) to avoid duplicates in a horizontally-scaled deployment.

If no

Pure on-demand enqueue is simpler — add scheduled capability only when you actually have recurring jobs.

Apply:

Background Jobs & Queues

Do you need exactly-once processing (rather than at-least-once delivery)?

If yes

Exactly-once requires transactional outbox plus idempotent handlers — no library gives it to you for free.

If no

At-least-once with idempotent handlers is the pragmatic production default.

Apply:

Background Jobs & Queues

Do you have mixed-priority workloads (user-visible vs batch)?

If yes

Use at least two queues (default, bulk) with separate worker pools so a long batch job never starves user-triggered work.

If no

A single queue is simpler and fine for homogeneous workloads.

Apply:

Background Jobs & Queues

Are your handlers idempotent by contract?

If yes

Aggressive retries are safe — store an idempotency key per job and dedupe on handler entry.

If no

Lean on a transactional outbox, and accept that retries will sometimes double-invoke side effects unless you add idempotency keys.
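Deduplicating on handler entry with an idempotency key can be illustrated as below. The `IdempotentRunner` class and `charge_customer` function are hypothetical, and the in-memory dict stands in for a durable store such as a table with a unique constraint on the key.

```python
from typing import Callable


class IdempotentRunner:
    """Dedupe on handler entry using a per-job idempotency key (in-memory sketch)."""

    def __init__(self) -> None:
        self._results: dict[str, object] = {}

    def run(self, idempotency_key: str, handler: Callable[[], object]) -> object:
        if idempotency_key in self._results:
            # A retry or duplicate delivery: return the stored result, skip the side effect.
            return self._results[idempotency_key]
        result = handler()
        self._results[idempotency_key] = result
        return result


charges: list[tuple[str, int]] = []


def charge_customer(customer_id: str, amount_cents: int) -> str:
    charges.append((customer_id, amount_cents))  # placeholder for a payment API call
    return f"charge_{len(charges)}"
```

This sketch is not safe under concurrency: two workers could both pass the membership check before either stores a result. A real implementation would claim the key atomically, for example by inserting the row first and treating a unique-constraint violation as "already handled".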

Apply:

Background Jobs & Queues

Do you need DAG pipelines (jobs that spawn dependent jobs)?

If yes

Use a durable workflow engine (Temporal, Inngest, BullMQ Flows) — rolling your own DAG orchestration is a year-long tarpit.

If no

Flat enqueue is simpler and covers the majority of use cases.

Apply:

Background Jobs & Queues

Do you need a dead-letter queue for failures?

If yes

Any production queue needs a DLQ with alerting on depth growth — silent job failure is a common outage source.

If no

Skip only for best-effort one-off jobs where losing the job is acceptable.

Apply:

Background Jobs & Queues

Do you need per-tenant queue isolation for noisy neighbors?

If yes

Shard queues by tenant or add per-tenant concurrency caps so one customer bursting to 10k jobs does not stall everyone else.

If no

A shared queue is fine in single-tenant or low-variance workloads.

Apply:

Background Jobs & Queues

Do long-running jobs need to be cancellable mid-run?

If yes

Pass a cancellation token through the handler and checkpoint progress so cancellation is responsive without data loss.

If no

If jobs complete quickly, retry-on-failure is simpler than implementing graceful cancellation.
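A cancellation token with checkpointing might look like the following sketch, which uses `threading.Event` as the token. The `run_import` function and the `next_index` checkpoint key are assumptions; in a real system the checkpoint would be persisted durably rather than kept in a dict.

```python
import threading
from typing import Callable


def run_import(rows: list, cancel: threading.Event, checkpoint: dict,
               process: Callable[[object], None]) -> str:
    """Process rows from the last checkpoint, checking the cancellation token between rows."""
    start = checkpoint.get("next_index", 0)
    for index in range(start, len(rows)):
        if cancel.is_set():
            return "cancelled"
        process(rows[index])
        checkpoint["next_index"] = index + 1  # persist durably in a real system
    return "done"
```

Checking the token between units of work keeps cancellation responsive, and the checkpoint lets a resumed run continue where the cancelled one stopped instead of repeating completed rows.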

Apply:

Background Jobs & Queues

Do you need per-job-type retry and backoff configuration?

If yes

Different failure modes need different backoff — network errors retry fast, rate-limit errors retry slow. Configure per job class.

If no

A single global retry policy (5 attempts, exponential backoff) is the pragmatic default.
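Per-job-class retry configuration with exponential backoff can be expressed as a small policy table. The job class names and all numeric values below are hypothetical, and jitter is left out to keep the arithmetic easy to follow.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RetryPolicy:
    max_attempts: int
    base_delay_s: float
    max_delay_s: float

    def delay_for(self, attempt: int) -> float:
        """Delay before retry number `attempt` (1-based): base * 2^(attempt - 1), capped."""
        return min(self.base_delay_s * 2 ** (attempt - 1), self.max_delay_s)

    def should_retry(self, attempts_made: int) -> bool:
        return attempts_made < self.max_attempts


# Hypothetical job classes and numbers, for illustration only.
DEFAULT_POLICY = RetryPolicy(max_attempts=5, base_delay_s=1.0, max_delay_s=60.0)
POLICIES = {
    "network_call": RetryPolicy(max_attempts=8, base_delay_s=0.5, max_delay_s=30.0),
    "rate_limited_api": RetryPolicy(max_attempts=5, base_delay_s=30.0, max_delay_s=900.0),
}


def policy_for(job_type: str) -> RetryPolicy:
    return POLICIES.get(job_type, DEFAULT_POLICY)
```

Under the default policy the first four retry delays are 1, 2, 4, and 8 seconds. In practice you would usually add random jitter so that many failed jobs do not retry at the same instant.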

Apply:

Background Jobs & Queues

Are jobs CPU-bound (heavy compute) rather than IO-bound (external calls)?

If yes

CPU-bound: use a worker pool sized to core count. Avoid async in the same process — it will not help and may hurt.

If no

IO-bound: use async/concurrent workers to maximize throughput on waiting time.

Apply:

Background Jobs & Queues

Do you need queue-depth and worker-lag observability?

If yes

Emit per-queue depth, processing latency, retry count, and DLQ size to Prometheus/Datadog — and alert on them.

If no

The built-in queue dashboard (Sidekiq Web, BullMQ Board) is enough for small teams.

Apply:

Background Jobs & Queues

Do you need to persist job results for later retrieval?

If yes

Store results in a separate results table keyed by job ID — clients poll or receive webhook/SSE when done.

If no

Fire-and-forget jobs are simpler; only persist results when a user UI depends on them.

Apply:

Background Jobs & Queues

Do job payloads contain sensitive data?

If yes

Encrypt payloads at rest (envelope encryption with KMS) — queue storage is usually less hardened than your primary DB.

If no

Plaintext payloads are fine for internal, non-PII work.

Apply:

Background Jobs & Queues

Should similar jobs be batched for efficiency?

If yes

Coalesce jobs (e.g., "send digest for user X") within a short window — a single batch handler beats N individual invocations for I/O.

If no

Per-job execution is simpler to reason about and debug.

Apply:

Background Jobs & Queues

Do you need an admin UI to list, retry, and cancel jobs?

If yes

Mount the queue library dashboard (Sidekiq Web, Oban Web, BullMQ Board) behind admin auth — zero-effort ops leverage.

If no

CLI tools and logs are enough for a small team; add UI when non-engineers need to investigate job failures.

Apply:

Rate Limiting & Abuse Prevention

Do you have unauthenticated endpoints (signup, login, public API)?

If yes

Add per-IP limits on those endpoints, plus a CAPTCHA once a threshold is crossed. Assume credential stuffing is attempted on day one.

If no

Per-user limits on authenticated APIs are sufficient.

Apply:

Rate Limiting & Abuse Prevention

Do different customer tiers pay for different rate limits?

If yes

Key limits on the API key, with thresholds configured per plan; expose a usage endpoint.

If no

A single default limit keeps configuration simple.

Apply:

Rate Limiting & Abuse Prevention

Do you expect adversarial traffic (credential stuffing, scraping, spam)?

If yes

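Use sliding window or token bucket. A fixed window can be gamed at window boundaries: a client that bursts at the end of one window and again at the start of the next gets up to twice the limit in a short span. Pair with a WAF and bot detection.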

If no

Fixed-window with Redis INCR + EXPIRE is cheap, simple, and sufficient.
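The difference between fixed-window and sliding-window limiting at a window boundary can be shown with a small in-memory simulation. Both classes below are hypothetical sketches (the fixed-window one mirrors an INCR on a per-window key; the sliding one keeps a log of timestamps), and the limit of 10 requests per 60 seconds is an arbitrary example.

```python
from collections import deque


class FixedWindowLimiter:
    def __init__(self, limit: int, window_s: int) -> None:
        self.limit, self.window_s = limit, window_s
        self.counts: dict[int, int] = {}

    def allow(self, now: float) -> bool:
        bucket = int(now // self.window_s)  # one counter per window, like INCR on a windowed key
        self.counts[bucket] = self.counts.get(bucket, 0) + 1
        return self.counts[bucket] <= self.limit


class SlidingWindowLimiter:
    def __init__(self, limit: int, window_s: int) -> None:
        self.limit, self.window_s = limit, window_s
        self.hits: deque = deque()

    def allow(self, now: float) -> bool:
        # Drop timestamps that have aged out of the trailing window.
        while self.hits and self.hits[0] <= now - self.window_s:
            self.hits.popleft()
        if len(self.hits) >= self.limit:
            return False
        self.hits.append(now)
        return True


def allowed_in_burst(limiter, timestamps) -> int:
    return sum(limiter.allow(t) for t in timestamps)


# 10 requests just before the 60-second boundary and 10 just after it.
burst = [59.0 + i * 0.09 for i in range(10)] + [60.0 + i * 0.09 for i in range(10)]
```

With a limit of 10 per 60 seconds, the fixed-window limiter admits all 20 requests in this burst, while the sliding-window limiter admits only the first 10.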

Apply:

Rate Limiting & Abuse Prevention

Do you have legitimate burst patterns (batch imports, bulk API calls)?

If yes

Token bucket is the right model — allows bursts while enforcing a sustained rate. The standard for commercial API gateways.

If no

Sliding window is simpler to reason about when you only need to enforce a steady rate.

Apply:

Rate Limiting & Abuse Prevention

Do you run multiple server nodes behind a load balancer?

If yes

Use centralized Redis for rate-limit state (Upstash Ratelimit, redis-cell). Per-node local counters let attackers get N×limit by rotating through nodes.

If no

In-process counters are fine for single-node deployments and dramatically cheaper.

Apply:

Rate Limiting & Abuse Prevention

Should users get a warning before they hit a hard limit?

If yes

Emit soft-limit warnings via response headers (X-RateLimit-Remaining) and optionally an in-app notification when usage >80%. Prevents angry support tickets.

If no

Silent throttling at the hard limit is simpler but worse UX — only acceptable for internal APIs.

Apply:

Rate Limiting & Abuse Prevention

Do you have enterprise customers who negotiate custom limits?

If yes

Build an admin override table keyed on tenant/API-key. Do not hardcode limits; the operations team will need to raise them without a deploy.

If no

Static per-tier limits in config are simpler and easier to reason about.

Apply:

Rate Limiting & Abuse Prevention

Can a single user enqueue unbounded background jobs (imports, scrapes, AI calls)?

If yes

Rate-limit the enqueue side separately from the API side. Prevents queue-flooding attacks that bypass request-layer limits.

If no

API-layer limits are sufficient; background jobs are produced by your own code only.

Apply:

Rate Limiting & Abuse Prevention

Do you have legitimate short bursts you want to allow (e.g. pagination fan-out)?

If yes

Use token bucket with a burst allowance (bucket capacity larger than the number of tokens refilled per second). Clients can consume the bucket quickly, then settle at the refill rate.

If no

A flat rate is simpler — bursts are a policy decision, not a default.
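A token bucket with a burst allowance can be sketched as follows. This is a single-process illustration with time passed in explicitly; the `TokenBucket` class and its parameter names are assumptions, and a multi-node deployment would keep this state in a shared store instead.

```python
class TokenBucket:
    """Token bucket: `capacity` bounds the burst, `refill_per_s` sets the sustained rate."""

    def __init__(self, capacity: float, refill_per_s: float, now: float = 0.0) -> None:
        self.capacity = capacity
        self.refill_per_s = refill_per_s
        self.tokens = capacity  # start full so an idle client can burst immediately
        self.updated_at = now

    def allow(self, now: float, cost: float = 1.0) -> bool:
        # Refill lazily based on elapsed time, never above capacity.
        elapsed = max(0.0, now - self.updated_at)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_per_s)
        self.updated_at = max(self.updated_at, now)
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False
```

With a capacity of 5 and a refill of 1 token per second, a client can make 5 requests at once, is then refused, and regains one request per second afterwards. The optional `cost` argument lets an expensive endpoint consume more than one token per call.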

Apply:

Rate Limiting & Abuse Prevention

Are you exposed to L3/L4 DDoS (public API, unauthenticated endpoints)?

If yes

Put Cloudflare, AWS Shield, or Fastly in front of your origin. Application-layer rate limiting cannot absorb network-layer floods.

If no

Application-layer limits are sufficient for authenticated-only APIs.

Apply:

Rate Limiting & Abuse Prevention

Do some endpoints cost 100x more than others (AI calls, complex queries, exports)?

If yes

Rate-limit by computational cost (credits/tokens per request) not by request count. Pricing and abuse protection align naturally.

If no

Request-count limits are simpler and sufficient when endpoint costs are roughly uniform.

Apply:

Rate Limiting & Abuse Prevention

Do specific features (uploads, AI generations) have their own cost or quota model?

If yes

Add per-feature limits in addition to global ones. A user at their upload quota should still be able to read the API.

If no

Per-route limits are enough and keep configuration centralized.

Apply:

Rate Limiting & Abuse Prevention

Is this a public developer API with SDKs / third-party integrations?

If yes

Always return X-RateLimit-Limit, X-RateLimit-Remaining, X-RateLimit-Reset, and Retry-After headers. Well-behaved clients need them to back off correctly.

If no

Minimum viable is Retry-After on 429 responses — detailed headers are nice-to-have for internal APIs.
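Building the headers named in this card could look like the sketch below. The helper name `rate_limit_response` is hypothetical, and emitting X-RateLimit-Reset as a Unix timestamp is an assumption; some APIs send seconds remaining instead, so document whichever convention you choose.

```python
import math


def rate_limit_response(limit: int, remaining: int, reset_epoch_s: float,
                        now_epoch_s: float, throttled: bool) -> tuple[int, dict[str, str]]:
    """Return (HTTP status, headers) for a rate-limited endpoint."""
    headers = {
        "X-RateLimit-Limit": str(limit),
        "X-RateLimit-Remaining": str(max(0, remaining)),
        "X-RateLimit-Reset": str(math.ceil(reset_epoch_s)),
    }
    if throttled:
        # Retry-After in delta-seconds form, rounded up so clients never retry early.
        headers["Retry-After"] = str(max(1, math.ceil(reset_epoch_s - now_epoch_s)))
        return 429, headers
    return 200, headers
```

Rounding Retry-After up rather than down avoids a client retrying a fraction of a second too soon and being throttled again.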

Apply:

Rate Limiting & Abuse Prevention

Do you have health checks, metrics endpoints, or internal traffic hitting rate-limited routes?

If yes

Exempt health checks and internal service-to-service calls by IP allowlist or dedicated service tokens. Otherwise monitoring will trip your own limits.

If no

Default behavior — all traffic counts — is simpler and auditable.

Apply:

Rate Limiting & Abuse Prevention

Do you have authentication endpoints at risk of credential stuffing?

If yes

Rate-limit failed logins separately (per-account + per-IP), with exponential backoff and lockout after N attempts. Combine with CAPTCHA on threshold.

If no

General per-IP limits are insufficient for auth — always treat login and password reset as a separate budget.
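A separate budget for failed logins, with exponential backoff and lockout, might be sketched like this. The `LoginThrottle` class and its thresholds are illustrative assumptions rather than recommendations; in practice you would keep one instance keyed by account and another keyed by IP, backed by a shared store.

```python
class LoginThrottle:
    """Track failed logins per key, with exponential backoff and lockout after N failures."""

    def __init__(self, free_attempts: int = 3, base_delay_s: float = 2.0,
                 lockout_after: int = 10) -> None:
        self.free_attempts = free_attempts
        self.base_delay_s = base_delay_s
        self.lockout_after = lockout_after
        self.failures: dict[str, int] = {}
        self.blocked_until: dict[str, float] = {}

    def is_locked(self, key: str) -> bool:
        return self.failures.get(key, 0) >= self.lockout_after

    def can_attempt(self, key: str, now: float) -> bool:
        return not self.is_locked(key) and now >= self.blocked_until.get(key, 0.0)

    def record_failure(self, key: str, now: float) -> None:
        count = self.failures.get(key, 0) + 1
        self.failures[key] = count
        if count > self.free_attempts:
            # Exponential backoff: 2s, 4s, 8s, ... once the free attempts are used up.
            delay = self.base_delay_s * 2 ** (count - self.free_attempts - 1)
            self.blocked_until[key] = now + delay

    def record_success(self, key: str) -> None:
        self.failures.pop(key, None)
        self.blocked_until.pop(key, None)
```

Per-account lockout can itself be abused to lock out a legitimate user, which is one reason to pair it with per-IP tracking and a CAPTCHA step rather than relying on lockout alone.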

Apply:

Onboarding & Activation

Do you know what the "aha moment" / activation event is for your product?

If yes

Design onboarding backwards from that event. Every step should reduce friction to it.

If no

Fix this before investing in onboarding UI. Instrument product events and find activation empirically.

Apply:

Onboarding & Activation

Is the first use of your product collaborative (requires inviting a teammate)?

If yes

Surface the invite action prominently in empty states or checklist; single-user usage is often not the activated state.

If no

Optimize for a single-user first success.

Apply:

Onboarding & Activation

Is your product self-serve (rather than sales-led or waitlisted)?

If yes

Optimize every step for a frictionless signup: single-field email, magic link, and empty-state-driven first action.

If no

Replace signup with a waitlist form + personal outreach; add a wizard (domain verification, SSO) post-sales.

Apply:

Onboarding & Activation

Is email verification strictly required before first use?

If yes

Send magic link or OTP, but let the user explore read-only state while waiting. Blocking on verification is a major conversion leak.

If no

Skip upfront verification — prompt later when the user takes an action that actually needs a verified email.

Apply:

Onboarding & Activation

Would a 1-2 question questionnaire meaningfully personalize the first experience?

If yes

Ask role/use-case upfront and route to a tailored template or empty state. Cap at 2 questions — each additional one costs conversion.

If no

Skip the questionnaire; ship the same empty state to everyone and let them self-select via templates.

Apply:

Onboarding & Activation

Does your product suffer from the blank-page problem (hard to see value empty)?

If yes

Pre-populate a labeled, deletable sample workspace. Essential for analytics, dashboards, and issue trackers.

If no

An empty-state prompt to "Create your first X" is cleaner and avoids clutter.

Apply:

Onboarding & Activation

Are team invites in the first session a leading indicator of activation?

If yes

Surface the invite step inside the checklist during onboarding — not after. A team workspace that stays single-user usually churns.

If no

Move invites out of onboarding — prompt only when the user takes a collaborative action.

Apply:

Onboarding & Activation

Does the product only deliver value once external data is connected (integrations, imports)?

If yes

Make the connect-data step the checklist centerpiece. Offer sample data as a fallback so users can explore before committing.

If no

Skip data connection in onboarding — offer it contextually when the user takes an action that needs it.

Apply:

Onboarding & Activation

Is your product complex enough that new users will miss key capabilities without guidance?

If yes

Add a short guided tour (under 5 steps) with a clear skip control. Never block the UI behind the tour.

If no

Skip the tour — empty-state prompts and in-app help cover it without the friction.

Apply:

Onboarding & Activation

Does your product surface more features as users mature (advanced settings, integrations)?

If yes

Progressively disclose — hide admin/API/settings until they are needed. Surface them via contextual prompts when the user is ready.

If no

Show the full surface from day one; simpler mental model.

Apply:

Onboarding & Activation

Do different user roles (admin, member, viewer) need materially different first experiences?

If yes

Branch onboarding on role — admins get setup checklist (billing, SSO, invites); members land directly in the workspace.

If no

One flow for all users is simpler and easier to iterate on.

Apply:

Onboarding & Activation

Can your empty states double as the onboarding surface?

If yes

Invest in empty-state-driven onboarding first. Every empty screen should show the next primary action, not marketing copy.

If no

Supplement with a checklist — but fix the empty states first; they will carry users further than any tour.

Apply:

Onboarding & Activation

Have you defined and instrumented a product activation metric (time-to-value)?

If yes

Track it in your analytics and optimize onboarding against it. Every step should either reduce TTV or get cut.

If no

Instrument this before building onboarding UI — without a metric you are guessing.

Apply:

Onboarding & Activation

Do a meaningful fraction of signups drop off before activation?

If yes

Enable behavior-triggered activation emails. A nudge at day 1 and day 3 recovers 5–15% of dropped signups.

If no

Skip — nudges on already-activated users feel spammy.

Apply:

Onboarding & Activation

Do you have the analytics maturity and signup volume to A/B test onboarding variants?

If yes

Use an experimentation framework (Statsig, LaunchDarkly, GrowthBook) and test one variable at a time.

If no

Iterate with qualitative feedback first — A/B tests need thousands of signups per week to reach significance.

Apply:

Onboarding & Activation

Does your product require payment or a card before the user sees value?

If yes

Only if you can justify it (enterprise, physical goods). For SaaS, defer card collection until after the user reaches their aha moment.

If no

Keep payment setup out of onboarding — gate it on usage or trial expiry.

Apply:

Onboarding & Activation

Do enterprise customers need to complete setup steps (domain verification, SSO, SCIM) before go-live?

If yes

Build a separate admin wizard post-signup for these. Keep end-user onboarding lightweight; admin work should not block the team.

If no

Skip the enterprise wizard entirely until you have paying enterprise customers requesting it.

Apply:

Preset

User System & Auth (Required, low)

Authentication Methods*

Classic credential-based login
One-click login via email link
Sign in with Google
Sign in with GitHub
Enterprise single sign-on

Access Control*

No roles — every authenticated user has the same access
Simple two-tier access control
Custom roles with fine-grained permissions

Multi-factor Authentication

Single-factor only
Time-based one-time passwords

Tradeoffs

Complexity: OAuth providers added

Each provider requires an OAuth app registration and key rotation policy

Complexity: SAML/SSO selected

Requires IdP partnership and XML-based protocol handling; significant integration work

Complexity: RBAC selected

Permission checks must be applied consistently across every data access path

Payments & Billing (Required, high)

Billing Model*

Single charge per product or service
Monthly or annual recurring billing
Charge based on consumption (API calls, seats, etc.)

Payment Processor*

Full-featured; best-in-class developer experience
Merchant of record; handles VAT/tax automatically

Tradeoffs

Complexity: Subscription billing selected

Requires handling trial periods, dunning, proration, and cancellation flows

Complexity: Usage-based billing selected

Must instrument every billable action and send metered events to billing provider

UX: Paddle selected

Less customizable checkout; Paddle acts as legal seller so you avoid VAT registration

Notifications (medium)

Delivery Method*

Push notifications instantly as events occur
Client polls server on a fixed interval
Send email when user is offline
Native push notifications for mobile apps

User Control

Simple global on/off toggle
Separate preferences per event category
Highly granular per-item preferences

Tradeoffs

Cost: Real-time delivery selected

Requires persistent connection infrastructure (e.g. Redis pub/sub, WebSocket server)

Latency: Polling selected

Higher server request volume; notifications may lag by poll interval

Complexity: Mobile push selected

Requires APNs/FCM credentials and certificate management

Complexity: Per-source granularity selected

Significantly more complex preference storage and UI

Roles & Permissions (Required, high)

Authorization Model*

A fixed enum on the user record gates admin-only routes.
Users are assigned roles; roles bundle permissions; code checks permissions, not roles.
Permissions derived from attributes or graph relationships (owner, member, parent folder, etc.).

Permission Scope*

Permissions apply across the entire product.
A user has different roles in different workspaces or organizations.
Access lives on the resource itself — share a single document with specific users.

Custom Role Management

Roles (admin / member / viewer) are defined in code; customers cannot change them.
Admins can create roles and assign permissions.

Tradeoffs

Complexity: Simple roles chosen

Fast to build but every 'special case' access rule becomes bespoke code that's hard to audit

Complexity: ABAC / ReBAC chosen

Requires a policy engine and relationship store kept in sync with primary data

Latency: Per-resource scope enabled

Every list/read query must filter by ACL — expect query-plan work and caching investment

Cost: Customer-defined roles enabled

Support load increases substantially — each customer now has a unique permission configuration

Multi-tenancy & Workspaces (Required, high)

Data Isolation Model*

Every table has a tenant_id column; every query filters on it.
One Postgres schema per tenant; the same table definitions are repeated in each schema.
Each tenant gets an isolated database (sometimes isolated VPC).
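For the shared-schema option, one way to reduce the risk of a missed filter is to route every query through a wrapper that always supplies the `tenant_id`. The sketch below uses an in-memory SQLite database for illustration only; the `projects` table and the `TenantScopedDb` class are assumptions.

```python
import sqlite3


class TenantScopedDb:
    """Wraps a connection so every query on `projects` carries a tenant_id filter."""

    def __init__(self, conn: sqlite3.Connection, tenant_id: str):
        self._conn = conn
        self._tenant_id = tenant_id

    def list_projects(self) -> list[str]:
        # The tenant filter lives here, not at each call site.
        rows = self._conn.execute(
            "SELECT name FROM projects WHERE tenant_id = ? ORDER BY name",
            (self._tenant_id,),
        ).fetchall()
        return [name for (name,) in rows]

    def add_project(self, name: str) -> None:
        self._conn.execute(
            "INSERT INTO projects (tenant_id, name) VALUES (?, ?)",
            (self._tenant_id, name),
        )


def make_demo_db() -> sqlite3.Connection:
    """Create the hypothetical shared table used in this sketch."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE projects (tenant_id TEXT NOT NULL, name TEXT NOT NULL)")
    return conn
```

Application code would receive a `TenantScopedDb` bound to the request's tenant and never touch the raw connection.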

Tenant Identification*

Tenant context derived from hostname; clean URLs and cookie scoping.
Tenant context read from the URL path.
One URL; user picks active tenant after login and it persists in their session.
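The hostname and URL-path options could resolve the tenant along these lines. `example.com`, the reserved subdomain list, and the `/t/<slug>/` path shape are placeholders chosen for the sketch.

```python
from typing import Optional
from urllib.parse import urlsplit

BASE_DOMAIN = "example.com"        # hypothetical product domain
RESERVED = {"www", "app", "api"}   # hypothetical non-tenant subdomains


def tenant_from_host(host: str) -> Optional[str]:
    """Return the tenant slug for `<slug>.example.com`, else None."""
    hostname = host.split(":", 1)[0].lower()  # drop an optional port
    suffix = "." + BASE_DOMAIN
    if not hostname.endswith(suffix):
        return None
    slug = hostname[: -len(suffix)]
    # Reject empty, nested, and reserved subdomains.
    if not slug or "." in slug or slug in RESERVED:
        return None
    return slug


def tenant_from_path(url: str) -> Optional[str]:
    """Return the tenant slug for paths shaped like `/t/<slug>/...`, else None."""
    parts = [p for p in urlsplit(url).path.split("/") if p]
    if len(parts) >= 2 and parts[0] == "t":
        return parts[1]
    return None
```

In either case the returned slug is untrusted input and still has to be checked against the signed-in user's memberships.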

Per-tenant Configuration

Enable/disable features for specific tenants (beta access, custom enterprise features).
Logo, colors, and optionally custom domain in tenant-facing UI.
Auth policy, session duration, data retention, allowed domains — configured per tenant.

Tradeoffs

Complexity: Shared DB, shared schema

Cheapest ops and easiest queries, but a single missed tenant_id filter is a data leak

Cost: Database-per-tenant

Operational cost scales with tenant count; migrations must run against every tenant DB

Complexity: Subdomain routing

Wildcard TLS cert and DNS configuration required; vanity domains compound that

Complexity: Per-tenant feature flags

Testing surface expands — every change must consider flag combinations and tenant configs

Transactional Email (Required, medium)

Delivery Provider*

Third-party email service with APIs, templates, and deliverability monitoring.
Cheapest at volume; minimal tooling built in.
Your own MTA (Postfix, Haraka) on your own IPs.

Deliverability Setup*

Authenticate your sending domain; publish a DMARC policy; monitor reports.
Consume bounce and spam-complaint webhooks; suppress invalid or unsubscribed addresses.
Different sending domains / IPs for product email vs marketing campaigns.
Your own sending IP instead of a shared pool.
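The bounce and complaint handling option above amounts to maintaining a suppression list and consulting it before every send. A minimal sketch follows, assuming webhook payloads have already been normalized to a dict with `type` and `email` keys. Real providers use their own event names and payload shapes.

```python
from dataclasses import dataclass, field

# Hypothetical normalized event types; real providers use their own names.
SUPPRESSING_EVENTS = {"hard_bounce", "spam_complaint", "unsubscribe"}


@dataclass
class SuppressionList:
    _addresses: set[str] = field(default_factory=set)

    def handle_webhook(self, event: dict) -> None:
        """Record the address if the event type means sending must stop."""
        if event.get("type") in SUPPRESSING_EVENTS:
            self._addresses.add(event["email"].strip().lower())

    def is_suppressed(self, email: str) -> bool:
        return email.strip().lower() in self._addresses

    def filter_recipients(self, recipients: list[str]) -> list[str]:
        """Drop suppressed addresses before handing the list to the provider."""
        return [r for r in recipients if not self.is_suppressed(r)]
```

A production version would persist the set in the database rather than in memory.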

Templating Approach*

Email templates live in your repo, reviewed and tested like any other code.
Templates managed in the provider dashboard; non-engineers can edit.
Single service manages email, in-app, SMS, and push with per-user preferences.

Tradeoffs

Cost: Managed provider chosen

Vendor cost scales with volume; deliverability expertise comes included

Complexity: SES chosen

Low per-email cost but you own deliverability operations (reputation, bounces, suppression)

Complexity: Separate streams for marketing vs transactional

Two sending configurations and domains to maintain — worth it for deliverability isolation

Audit Logging (medium)

What to Log*

Sign-in, sign-out, failed login, password reset, MFA enrollment, session revocation.
Role changes, permission grants, user invites, impersonation sessions, configuration changes.
Record who viewed or exported which records.
Record state changes to business records with before/after values.

Storage Backend*

An `audit_log` table with insert-only permissions; never updated or deleted.
Stream audit events to S3 Object Lock, AWS QLDB, or a dedicated audit platform.
Primary audit storage lives in the customer's SIEM or your own.
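The append-only `audit_log` table option relies on database role permissions. SQLite has no roles, so the sketch below swaps in triggers that reject UPDATE and DELETE to illustrate the same insert-only behavior. The column names are assumptions.

```python
import sqlite3

# Hypothetical schema; the triggers stand in for insert-only role grants.
SCHEMA = """
CREATE TABLE audit_log (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    actor TEXT NOT NULL,
    action TEXT NOT NULL,
    occurred_at TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP
);
CREATE TRIGGER audit_log_no_update BEFORE UPDATE ON audit_log
BEGIN SELECT RAISE(ABORT, 'audit_log is append-only'); END;
CREATE TRIGGER audit_log_no_delete BEFORE DELETE ON audit_log
BEGIN SELECT RAISE(ABORT, 'audit_log is append-only'); END;
"""


def open_audit_db() -> sqlite3.Connection:
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    return conn


def record(conn: sqlite3.Connection, actor: str, action: str) -> None:
    """Append one audit event; existing rows are never modified."""
    conn.execute(
        "INSERT INTO audit_log (actor, action) VALUES (?, ?)", (actor, action)
    )
    conn.commit()
```

On Postgres, the closer equivalent would be granting the application role INSERT and SELECT only on the table.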

User-facing Surface

A searchable, filterable log in your admin panel showing recent events.
A paginated API or scheduled export letting customers ingest audit data themselves.

Tradeoffs

Cost: Data access logging enabled

Write amplification: every authenticated read produces a log write

Complexity: Immutable store chosen

Two storage systems to operate and keep in sync; queries may need to federate

Complexity: Append-only table only

Tamper-evidence relies on DB role permissions — insufficient for some compliance regimes

Background Jobs & Queues (Required, medium)

Queue Backend*

Jobs are rows in a Postgres/SQLite table; on Postgres, workers claim rows with SELECT ... FOR UPDATE SKIP LOCKED (SQLite has no equivalent locking clause).
Durable queue on Redis with mature worker libraries.
Purpose-built broker with durability, backpressure, and multi-consumer support.

Required Capabilities*

Recurring jobs (nightly emails, weekly reports) defined in code or a UI.
Enqueue a job to run at a specific future timestamp (trial expiry, reminder emails).
Separate queues for high-priority (user-triggered exports) vs bulk (nightly batches).
Throttle jobs that call rate-limited external APIs.

Failure & Durability*

Failed jobs retry with increasing delay; after N attempts move to a dead-letter queue for human review.
Jobs are enqueued inside DB transactions; handlers are idempotent, so retries do not duplicate effects.
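The retry-then-dead-letter behavior described above might look like this in outline. The exponential schedule, the default of three attempts, and the `Worker` class are assumptions for illustration. A real queue library would persist the retry schedule instead of keeping it in a list.

```python
from dataclasses import dataclass, field
from typing import Callable


def backoff_delay(attempt: int, base_seconds: float = 2.0,
                  cap_seconds: float = 300.0) -> float:
    """Exponential delay before retrying after failed attempt `attempt` (1-based), capped."""
    return min(cap_seconds, base_seconds * (2 ** (attempt - 1)))


@dataclass
class Job:
    id: str
    payload: dict
    attempts: int = 0


@dataclass
class Worker:
    handler: Callable[[Job], None]
    max_attempts: int = 3
    retry_schedule: list = field(default_factory=list)  # (job id, delay) pairs
    dead_letter: list = field(default_factory=list)

    def run(self, job: Job) -> bool:
        """Run the job once; on failure, schedule a retry or dead-letter it."""
        job.attempts += 1
        try:
            self.handler(job)
            return True
        except Exception:
            if job.attempts >= self.max_attempts:
                self.dead_letter.append(job)  # left for human review
            else:
                self.retry_schedule.append((job.id, backoff_delay(job.attempts)))
            return False
```

Adding random jitter to the delay is a common refinement when many jobs fail at the same moment.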

Tradeoffs

Scalability: Database-backed queue at high volume

Primary DB absorbs queue write load; row-level locks contend with application queries

Complexity: Redis-backed queue without outbox

Enqueue happens outside DB transaction — jobs can run for state that was rolled back

Complexity: Transactional outbox chosen

Additional table, polling worker, and idempotency discipline; the payoff is that jobs are never enqueued for rolled-back state and retries do not duplicate side effects
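As a rough illustration of the transactional outbox, the sketch below writes the business row and the job row in a single SQLite transaction, so a rollback discards both. The `users` and `outbox` tables and the `send_welcome_email` job type are hypothetical, and the polling worker that would later mark rows as dispatched is omitted.

```python
import json
import sqlite3


def open_db() -> sqlite3.Connection:
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT NOT NULL UNIQUE);
        CREATE TABLE outbox (
            id INTEGER PRIMARY KEY,
            job_type TEXT NOT NULL,
            payload TEXT NOT NULL,
            dispatched INTEGER NOT NULL DEFAULT 0
        );
    """)
    return conn


def sign_up(conn: sqlite3.Connection, email: str) -> None:
    """Insert the user and the welcome-email job in one transaction."""
    with conn:  # commits on success, rolls back if an exception escapes
        conn.execute("INSERT INTO users (email) VALUES (?)", (email,))
        conn.execute(
            "INSERT INTO outbox (job_type, payload) VALUES (?, ?)",
            ("send_welcome_email", json.dumps({"email": email})),
        )


def pending_jobs(conn: sqlite3.Connection) -> list:
    """Jobs a polling worker would pick up and hand to the real queue."""
    rows = conn.execute(
        "SELECT job_type, payload FROM outbox WHERE dispatched = 0 ORDER BY id"
    ).fetchall()
    return [(job_type, json.loads(payload)) for job_type, payload in rows]
```

A failed sign-up (for example, a duplicate email) leaves no outbox row behind, which is the property the Redis-without-outbox option lacks.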

Rate Limiting & Abuse Prevention (medium)

Rate Limit Algorithm*

Count requests per user per minute; reset at the minute boundary.
Counts over a rolling window to eliminate boundary bursts.
Each identity has a bucket that refills at a steady rate; each request consumes a token.
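Of the three algorithms, the token bucket is the least obvious to implement, so here is a minimal single-process sketch. The class name and parameters are assumptions, and a multi-server deployment would keep this state in a shared store instead of in memory.

```python
import time
from typing import Callable


class TokenBucket:
    """Bucket that starts full, refills at a steady rate, and spends tokens per request."""

    def __init__(self, capacity: float, refill_per_second: float,
                 clock: Callable[[], float] = time.monotonic):
        self.capacity = capacity
        self.refill_per_second = refill_per_second
        self._clock = clock          # injectable so tests can control time
        self._tokens = capacity
        self._last = clock()

    def allow(self, cost: float = 1.0) -> bool:
        """Refill for the elapsed time, then try to spend `cost` tokens."""
        now = self._clock()
        elapsed = now - self._last
        self._last = now
        self._tokens = min(self.capacity,
                           self._tokens + elapsed * self.refill_per_second)
        if self._tokens >= cost:
            self._tokens -= cost
            return True
        return False
```

Capacity sets the largest permitted burst, while the refill rate sets the sustained request rate. One bucket would be kept per identity (user ID, API key, or IP).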

What to Limit By*

Limits keyed on client IP.
Limits keyed on user ID or API key.
Expensive endpoints (search, export) get lower limits than cheap ones.
Aggregate limit across all users in a workspace.

Abuse Prevention Layer

hCaptcha / Turnstile on signup, login, and password reset — triggered on threshold breach, not every request.
ML-driven client fingerprinting to score requests as human/bot before routing.
Cloudflare, AWS WAF, or Fastly with OWASP and bot rule packs in front of your origin.

Response Behavior*

Return HTTP 429 with Retry-After, X-RateLimit-Limit, and X-RateLimit-Remaining so clients back off correctly.
After threshold, delay responses by seconds instead of returning 429.
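The 429 option can be sketched as a small helper that builds the status code and headers. The function name and the convention that a negative `remaining` means "over the limit" are assumptions made for this example.

```python
import math


def rate_limit_response(limit: int, remaining: int, reset_in_seconds: float):
    """Return (status, headers) given the caller's quota after counting this request.

    `remaining` below zero means this request exceeded the limit.
    """
    headers = {
        "X-RateLimit-Limit": str(limit),
        "X-RateLimit-Remaining": str(max(0, remaining)),
    }
    if remaining < 0:
        # Retry-After in whole seconds, rounded up so clients never retry early.
        headers["Retry-After"] = str(max(1, math.ceil(reset_in_seconds)))
        return 429, headers
    return 200, headers
```

For example, `rate_limit_response(100, -1, 12.3)` yields status 429 with `Retry-After: 13`.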

Tradeoffs

UX: Per-IP limits only

False positives behind corporate NATs; attackers bypass with rotating proxies

Scalability: Per-tenant limits enabled

Noisy-neighbor protection — one tenant cannot starve others

Cost: Token bucket chosen

Allows bursts but requires per-identity bucket state in Redis, so memory use grows with the number of identities

Latency: WAF added in front

Meaningful latency cost at the edge if the WAF is geographically distant from users

Onboarding & Activation (medium)

Onboarding Format*

No dedicated onboarding UI — every empty screen contains a clear primary action pointing to the next step.
A dismissible checklist ("Invite teammate • Create project • Connect integration") visible until complete.
Step-through overlays point at UI elements on first use.
User cannot access the product until they complete N configuration screens.

Personalization Signals

Ask one or two questions to route the user to a tailored first experience.
Present starter templates ('Blank', 'Team docs', 'Marketing site') as the first interaction.
Every new workspace starts with an example project the user can play with.

Activation Support

Email nudges when a user signed up but has not yet hit the key activation action (e.g. created their first project).
A persistent help button that opens relevant docs or a short walkthrough based on the current page.
Intercom-style chat surface active for new users in their first few days.

Tradeoffs

UX: Setup wizard as format

Controls the first experience, but sign-up drop-off grows with wizard length

Complexity: Activation emails enabled

Requires event tracking + scheduled jobs + segmentation infrastructure

Cost: Live chat during onboarding

Staffing cost scales with signup volume — not viable for self-serve products below a certain ACV