AI BaZi Analysis SaaS
AI fortune-telling analysis SaaS: Async architecture + payment closed-loop + GDPR-native compliance, zero-downtime frontend and backend refactoring.
P99 response improved from ~90s wait to instant return (async tasks + three-phase API).
Key API p95 ≤ 600ms; task status query < 100ms.
Email delivery moved from unstable SMTP to the Mailgun HTTP API, reaching a 99.9%+ delivery rate.

- My Role: Independent Product Builder
- I led the entire lifecycle from product concept to launch: from market research, prototype design, tech stack selection to full-stack development and deployment, independently implementing Java backend refactoring, Nuxt frontend refactoring, Stripe payment closed-loop, GDPR compliance framework, and cloud-native deployment.
- This is a production-grade SaaS product entirely designed, developed, and delivered by me independently, proving my ability to transform complex ideas into sustainable commercial systems.
Background & Goals
- The prototype used FastAPI. Long AI processing (30–90s) behind synchronous requests caused long waits, token expiration, and limited concurrency; the product needed enterprise-grade scaling and compliance.
- Goal: Refactor to enterprise-grade (payment closed-loop, internationalization, GDPR compliance, observability), seamless frontend migration with continuous operations.
- Risk points: Cloud platform SMTP limitations, PostgreSQL SSL connection configuration, Render memory limits, AI volatility and consistency.
Architecture
Architecture: Frontend (Nuxt 3) → API Gateway (Spring Boot) → Business Services (async engine, translation service, payment service) → PostgreSQL / Redis → Third-party (Stripe/Mailgun/Gemini)
Technical Highlights:
1. Three-tier API design (BFF → Core → External)
Isolates frontend changes through BFF (Backend for Frontend) layer, Core API focuses on core business stability, External API manages third-party services uniformly, achieving high cohesion and low coupling.
2. Async task processing (@Async + CompletableFuture + PostgreSQL state table)
Zero extra cost, simplified architecture. AI deep analysis (30-90s) uses async processing, returns task ID immediately, client polls for results. Task creation <600ms, status query <100ms, improved user experience.
3. Idempotency design
All external callback interfaces (e.g., payment notifications) implement idempotency based on transaction_id or nonce, preventing duplicate processing due to network retries.
4. Database row-level security (RLS)
Set up row-level security policies (RLS) in Supabase (PostgreSQL) for users, deep_jobs, deep_reports, and transactions tables, ensuring users can only access their own data, providing security at the database level.
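The async task flow in highlight 2 can be illustrated with a minimal sketch in plain Java. It assumes an in-memory map as a stand-in for the PostgreSQL state table and uses `CompletableFuture.supplyAsync` instead of Spring's `@Async`; the class, method, and state names (`AsyncJobService`, `JobStatus`, and so on) are hypothetical and not taken from the project.

```java
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Supplier;

/** Hypothetical task states; the real state table may use different names. */
enum JobStatus { PENDING, RUNNING, DONE, FAILED }

/** One row of the (here in-memory) job state table. */
final class JobRow {
    final JobStatus status;
    final String result;

    JobRow(JobStatus status, String result) {
        this.status = status;
        this.result = result;
    }
}

class AsyncJobService {
    // Stand-in for the PostgreSQL state table, keyed by task ID.
    private final Map<String, JobRow> jobs = new ConcurrentHashMap<>();

    /** Phase 1: register the job, start the work in the background, return the ID at once. */
    String submit(Supplier<String> slowAnalysis) {
        String jobId = UUID.randomUUID().toString();
        jobs.put(jobId, new JobRow(JobStatus.PENDING, null));
        CompletableFuture
            .supplyAsync(() -> {
                jobs.put(jobId, new JobRow(JobStatus.RUNNING, null));
                return slowAnalysis.get();
            })
            // Record the outcome whether the analysis succeeded or threw.
            .whenComplete((result, error) -> jobs.put(jobId, error == null
                ? new JobRow(JobStatus.DONE, result)
                : new JobRow(JobStatus.FAILED, null)));
        return jobId;
    }

    /** Phase 2: cheap status lookup used by the polling client. */
    JobStatus status(String jobId) {
        JobRow row = jobs.get(jobId);
        if (row == null) {
            throw new IllegalArgumentException("unknown job " + jobId);
        }
        return row.status;
    }

    /** Phase 3: fetch the finished report (authentication is omitted in this sketch). */
    String report(String jobId) {
        JobRow row = jobs.get(jobId);
        if (row == null || row.status != JobStatus.DONE) {
            throw new IllegalStateException("report not ready");
        }
        return row.result;
    }
}
```

A client would call `submit`, poll `status` until it reports `DONE` or `FAILED`, and then call `report`. Keeping the state in a table rather than in the future itself is what lets any instance answer the polling request.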
Results & Metrics
Performance & Stability:
| Metric | Value |
|---|---|
| Login p95 | ≈ 450ms |
| Create Task p95 | ≈ 520–600ms |
| Status Query p95 | ≈ 85ms |
| Webhook | ≈ 185ms |
| Translation hit rate (three-tier cache: frontend memory → DB → LLM writeback self-healing) | >95% |
| HikariCP connection leaks (leak-detection threshold and pool size tuning) | Eliminated |
Business Impact:
- Payment success rate >99% (Checkout + Webhook idempotency; automatic refund on failure).
- Email delivery 99.9%+ (SMTP → Mailgun HTTP).
Key Code & Engineering Practices
GeminiAnalyzerService (Async Core)
CompletableFuture + five-stage state machine + three automatic refund compensations, reducing user wait from 90s→instant return.
- ✨ AI Consistency Locking: Proactively set Gemini temperature parameter to 0.1 (near zero), solving AI result random fluctuation, improving result consistency from ~70% to 99%+, ensuring service professionalism and reliability.
- ✨ Defensive JSON Validation: Before saving AI results, perform integrity validation through custom isValidCompleteJson method. Result: 100% prevention of truncated JSON data polluting the database due to API timeout or network issues, ensuring data quality.
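As an illustration of the defensive validation idea, here is a minimal sketch of a structural completeness check. It is not the project's `isValidCompleteJson` implementation, which is not shown here; the `JsonGuard` class and `looksComplete` method are hypothetical names. The check only confirms that brackets and string literals are closed, so it catches truncation but is not a full JSON validator; a production version would likely also parse the text with a JSON library.

```java
import java.util.ArrayDeque;
import java.util.Deque;

class JsonGuard {
    /**
     * Structural completeness check (not a full JSON validator): the text must
     * start with '{' or '[', every bracket must be closed in order, and no
     * string literal may be left open. Truncated payloads fail this check.
     */
    static boolean looksComplete(String json) {
        if (json == null) {
            return false;
        }
        String s = json.trim();
        if (s.isEmpty() || (s.charAt(0) != '{' && s.charAt(0) != '[')) {
            return false;
        }
        Deque<Character> open = new ArrayDeque<>();
        boolean inString = false;
        boolean escaped = false;
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (inString) {
                // Inside a string only escapes and the closing quote matter.
                if (escaped) {
                    escaped = false;
                } else if (c == '\\') {
                    escaped = true;
                } else if (c == '"') {
                    inString = false;
                }
                continue;
            }
            if (c == '"') {
                inString = true;
            } else if (c == '{' || c == '[') {
                open.push(c);
            } else if (c == '}' || c == ']') {
                if (open.isEmpty()) {
                    return false;
                }
                char expected = (c == '}') ? '{' : '[';
                if (open.pop() != expected) {
                    return false;
                }
                // Top-level value closed: nothing but the end may follow.
                if (open.isEmpty() && i != s.length() - 1) {
                    return false;
                }
            }
        }
        return open.isEmpty() && !inString;
    }
}
```

A caller would run this check before persisting an AI response and mark the task as failed (triggering the refund path) when it returns `false`.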
backend-java/services/GeminiAnalyzerService.java
DatabaseSequenceFixService (Self-healing)
On startup, automatically scans and repairs PostgreSQL auto-increment sequences with zero manual intervention. The repair is integrated with the Actuator health check, and repair records are logged for ops monitoring.
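A sequence repair of this kind typically comes down to one statement per table. The sketch below only builds that statement, using the standard PostgreSQL functions `setval` and `pg_get_serial_sequence`; the `SequenceFixSql` class and its method names are hypothetical, and how the real service discovers tables and executes the SQL is not shown in this write-up.

```java
class SequenceFixSql {
    /**
     * Builds the PostgreSQL statement that realigns a serial/identity sequence
     * with the current maximum of its column, so the next insert gets MAX + 1.
     * Identifiers are validated because they cannot be bound as parameters.
     */
    static String repairStatement(String table, String column) {
        requireIdentifier(table);
        requireIdentifier(column);
        return "SELECT setval(pg_get_serial_sequence('" + table + "', '" + column + "'), "
             + "COALESCE(MAX(" + column + "), 0) + 1, false) FROM " + table;
    }

    /** Accepts only plain lower-case identifiers (an assumption of this sketch). */
    private static void requireIdentifier(String name) {
        if (name == null || !name.matches("[a-z_][a-z0-9_]*")) {
            throw new IllegalArgumentException("unsafe identifier: " + name);
        }
    }
}
```

Passing `false` as the third argument of `setval` means the next `nextval` call returns exactly the value that was set, which also behaves correctly on an empty table.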
backend-java/services/DatabaseSequenceFixService.java
API Contract
- JWT stateless authentication (30 min expiration, Header transmission, CORS-friendly); refresh mechanism reserved.
- Endpoint permission matrix: Public registration/login/email verification; task status is public query (progress only), report retrieval requires authentication.
- Error model & retry/idempotency: Webhook signature verification, session/intent ID unique index, duplicate callback safety.
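The duplicate-callback safety described above can be sketched as follows. This is a minimal illustration, assuming an in-memory set as a stand-in for the unique index on the Stripe session/intent ID; the `WebhookDeduplicator` class and its methods are hypothetical, and signature verification is left out.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

class WebhookDeduplicator {
    // Stand-in for a unique index on the Stripe session/intent ID column.
    private final Set<String> processed = ConcurrentHashMap.newKeySet();
    private int creditsGranted = 0;

    /**
     * Applies a "payment succeeded" callback at most once per session ID.
     * Returns true if the event was applied, false for a duplicate delivery.
     */
    synchronized boolean handlePaymentSucceeded(String sessionId) {
        if (!processed.add(sessionId)) {
            return false; // already handled: acknowledge without side effects
        }
        creditsGranted++;
        return true;
    }

    synchronized int creditsGranted() {
        return creditsGranted;
    }
}
```

With a real database, the same effect comes from inserting the transaction row first and treating a unique-constraint violation as "already processed", so a retried webhook is acknowledged without granting credits twice.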
Data Model
- Core entities: USERS / DEEP_JOBS / DEEP_REPORTS / TRANSACTIONS / TRANSLATIONS.
- Task-Report 1:1, User-Task/Report/Transaction 1:N; transaction records are associated with the Stripe session/intent ID; the translation table has a unique key and a usage count.
- Indexes & hot queries: User login unique index, task list compound/covering index, transaction unique key.
- Stateless quota self-healing: Designed an `ai_deep_used_today` field in the `users` table, combined with the business logic `resetDailyQuotaIfNeeded()`, achieving automatic daily quota reset on query. Advantage: No dependency on an external Cron Job or scheduled tasks, simplified architecture, reduced ops cost and single point of failure risk.
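The reset-on-query idea can be sketched in a few lines. The sketch assumes a second, hypothetical field (`quotaDate`) that records which day the counter belongs to; only `ai_deep_used_today` and `resetDailyQuotaIfNeeded()` are named in the project, and the `UserQuota` class and `tryConsume` method are illustrative.

```java
import java.time.LocalDate;

class UserQuota {
    int aiDeepUsedToday;   // mirrors users.ai_deep_used_today
    LocalDate quotaDate;   // hypothetical column: the day the counter belongs to

    UserQuota(int aiDeepUsedToday, LocalDate quotaDate) {
        this.aiDeepUsedToday = aiDeepUsedToday;
        this.quotaDate = quotaDate;
    }

    /** Lazily resets the counter the first time the user is seen on a new day. */
    void resetDailyQuotaIfNeeded(LocalDate today) {
        if (quotaDate == null || quotaDate.isBefore(today)) {
            aiDeepUsedToday = 0;
            quotaDate = today;
        }
    }

    /** Resets if needed, then consumes one unit when the daily limit allows it. */
    boolean tryConsume(int dailyLimit, LocalDate today) {
        resetDailyQuotaIfNeeded(today);
        if (aiDeepUsedToday >= dailyLimit) {
            return false;
        }
        aiDeepUsedToday++;
        return true;
    }
}
```

Because the reset happens inside the read path, a user who is inactive for several days simply gets a fresh counter on the next request, and no scheduler has to touch every row at midnight.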
Security & Privacy
- Data map & lifecycle: Collection→Transmission→Storage→Access→Destruction (table + timeframe), minimization principle and transparent disclosure.
- Transmission/storage security: TLS1.3, JWT signing, DB encryption, sensitive field masking, Webhook signature verification and key rotation.
- Cookie policy: Necessary/functional grading, functional enabled after consent.
Performance & Scalability
- Multi-tier caching: Frontend memory preload → Database query → LLM self-healing writeback, 95%+ hit rate.
- Capacity planning:
  - Precise JVM tuning: Under Render free tier 1GB memory limit, through JVM `-XX:MaxRAMPercentage=70.0`, dynamically set heap memory upper limit to 70% of container available memory, reserving 30% for Metaspace and system processes. Result: Memory usage optimized from 85% to 65%, avoided OOM risk, maximized resource utilization.
  - Connection pool proactive defense: Not only resolved the connection leak through HikariCP optimization, but also set `leak-detection-threshold: 90000` (90s), precisely matching the AI task max duration, achieving proactive monitoring and early warning for potential long connections and ensuring database health.
- Async benefits: Reduced user perceived wait from 90s → 0s, improved system throughput, polling endpoint <100ms.
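The multi-tier translation lookup listed under this heading can be sketched as a read-through cache. This is a simplified illustration that places all three tiers in one Java class, whereas the first tier lives in the frontend in the real system; the `TranslationCache` class is hypothetical, a plain map stands in for the translations table, and a function stands in for the LLM call.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.UnaryOperator;

class TranslationCache {
    private final Map<String, String> memory = new ConcurrentHashMap<>(); // tier 1
    private final Map<String, String> database;  // tier 2: stand-in for the translations table
    private final UnaryOperator<String> llm;     // tier 3: stand-in for the LLM call
    int llmCalls = 0;                            // exposed only to make the sketch observable

    TranslationCache(Map<String, String> database, UnaryOperator<String> llm) {
        this.database = database;
        this.llm = llm;
    }

    /** Looks up memory, then the database, then the LLM, writing results back upward. */
    synchronized String translate(String key) {
        String cached = memory.get(key);
        if (cached != null) {
            return cached;
        }
        String value = database.get(key);
        if (value == null) {
            value = llm.apply(key);
            llmCalls++;
            database.put(key, value); // self-healing writeback: the miss is filled for next time
        }
        memory.put(key, value);
        return value;
    }
}
```

Each miss is paid for once: after the writeback, later requests for the same key are served from the database or memory, which is how a hit rate above 95% becomes plausible for a mostly fixed vocabulary.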
Architecture Decision Records (ADR)
- FastAPI → Spring Boot 3 trade-offs (ecosystem/concurrency model/observability/team structure).
- SMTP → Mailgun HTTP API (cloud platform port restrictions vs delivery rate stability).
- Three-phase API (experience vs security, report retrieval re-authentication).
My Role
- Designed and implemented full-stack architecture (Spring Boot 3 + Nuxt 3)
- Developed GDPR compliance framework (consent management, data lifecycle control)
- Integrated Gemini 2.5 Pro + custom Prompt engine
- Built payment and account closed-loop (Stripe)
- Deployed to production environment (Vercel + Render, 99.9% availability)
- Led user research and UX iterative optimization
Next Steps
- ✅ Implemented: Actuator health check, Docker multi-stage build, JVM tuning
- 🔜 Planned: Refresh Token, Prometheus monitoring, k6 stress testing, canary release