AttackVector.tech
ai development · security tools · software architecture · startupgpt · engineering retrospective

Why We Built AttackVector with StartupGPT.pro Instead of Base44, Cursor, Codex, or Claude Code

An honest engineering retrospective on building production-grade security software with AI

AttackVector Team
Security Researchers

January 6, 2026 · 12 min read

Summary

We built AttackVector — a full-stack AI-powered pentesting platform with a FastAPI backend, Next.js frontend, admin panel, Docker tool orchestration, and a two-pass LLM content pipeline — entirely using StartupGPT.pro. This is our honest engineering retrospective on why we chose it over Base44, plain Cursor, OpenAI Codex, and Claude Code, what worked, what didn't, and what we learned about building complex security software with AI assistance.

ELI5 — The Simple Version

Imagine you need to build an entire house — not just pick curtains. Base44 lets you drag-and-drop a nice-looking shed, but you can't add plumbing or wiring. Cursor is a skilled helper who forgets everything overnight, so you explain the blueprint again every morning. Codex is a genius carpenter who builds perfect doors that don't fit your frames. Claude Code is a meticulous electrician who wires one room flawlessly but can't coordinate with the plumber next door. StartupGPT.pro is the general contractor who knows your entire blueprint, remembers every decision from last week, coordinates all the trades, and makes sure the plumbing actually connects to the kitchen.

The Problem: Building a Security Platform Is Not a Weekend Hackathon

AttackVector is not a simple CRUD app. It is a multi-service security platform with a Python FastAPI backend, a Next.js 16 frontend, a separate admin panel, Docker-based tool orchestration for disposable scanning containers, a two-pass LLM content generation pipeline, SEO-optimized glossary and blog systems, Stripe subscription integration, and deployment automation scripts. The codebase spans over 80 files across four distinct application layers.

When we set out to build this, we evaluated every major AI-assisted development approach available. We tried them. We hit walls. Then we found what worked.

Base44: Fast to Start, Impossible to Finish

Base44 is a no-code platform designed for rapid prototyping. If you need a landing page with a form that saves to a database, Base44 gets you there in an afternoon. We respect that.

But AttackVector needed things Base44 was never designed for:

  • Disposable Docker containers that spin up per-scan, execute security tools like nmap and nuclei in isolation, and self-destruct after capturing normalized JSON output
  • A tool wrapper layer with allowlisted binaries, denylisted dangerous flags, hard scope enforcement, secret redaction, and timeout controls
  • Custom FastAPI middleware for API key authentication, rate limiting, and audit logging (sketched below)
  • A two-pass content generation pipeline where a draft generator calls OpenAI, an editorial refiner polishes the output, and a self-checker validates quality scores before persisting to static TypeScript files

None of this maps to visual builders. No-code platforms optimize for the 80% of apps that are forms-over-data. Security tooling lives in the other 20% where you need full control over process execution, network isolation, and output sanitization.
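To make that concrete, here is roughly what the middleware item above looks like in spirit. This is a simplified sketch, not our production code: the header name, limit, and in-memory counters are stand-ins for what a real deployment would back with a secrets store and Redis.

```python
import logging
import time

from fastapi import FastAPI, Request
from fastapi.responses import JSONResponse

app = FastAPI()
audit_log = logging.getLogger("audit")

API_KEYS = {"demo-key"}           # illustrative; real keys live in a secrets store
RATE_LIMIT_PER_MINUTE = 60        # illustrative limit
_recent_requests: dict[str, list[float]] = {}

@app.middleware("http")
async def auth_rate_limit_audit(request: Request, call_next):
    key = request.headers.get("X-API-Key", "")
    if key not in API_KEYS:
        return JSONResponse({"detail": "invalid API key"}, status_code=401)

    # Naive in-memory sliding window; a real deployment would use Redis or similar.
    now = time.time()
    window = [t for t in _recent_requests.get(key, []) if now - t < 60]
    if len(window) >= RATE_LIMIT_PER_MINUTE:
        return JSONResponse({"detail": "rate limit exceeded"}, status_code=429)
    _recent_requests[key] = window + [now]

    response = await call_next(request)
    audit_log.info("key=%s path=%s status=%s", key[:8], request.url.path, response.status_code)
    return response
```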

We spent two days trying to prototype the scanning pipeline in Base44. We got a nice UI. The backend was a wall of limitations. We moved on.

Plain Cursor: Brilliant Assistant, Amnesia Patient

Cursor is genuinely impressive. The AI understands code, suggests intelligent completions, and can generate entire functions from natural language. We used it — and still use it — as our IDE.

But there is a critical difference between using Cursor as an editor and using it as your primary development orchestrator.

The Context Window Problem

Every time you open a new Cursor session, the AI starts with a blank slate. It doesn't remember that yesterday you refactored the Docker executor to use a registry pattern. It doesn't know that the glossary content pipeline shares a markdown renderer with the blog system. It doesn't recall that the admin API uses a specific Pydantic model structure that must stay consistent across blog and glossary endpoints.

For AttackVector, this meant we were spending 15-20 minutes per session re-explaining the architecture. Copy-pasting file contents into chat. Reminding the AI about conventions. After a week, the cognitive overhead was brutal.

No Project-Level Conventions

Cursor operates at the file level. It sees what you show it. But when you are coordinating changes across backend/app/api/v1/admin_content.py, backend/app/services/content_management_service.py, frontend/src/lib/glossary/content.ts, and admin/src/app/dashboard/content/page.tsx simultaneously — because adding a new content type touches all four — you need an AI that understands the entire dependency graph, not just the file currently open.

Plain Cursor is a brilliant pair programmer. But pair programmers don't replace architects. On a project this complex, we needed both.

OpenAI Codex: Impressive Isolation, Poor Integration

Codex is powerful at generating self-contained code blocks. Ask it for a Python function that parses nmap XML output into a normalized JSON schema, and it will deliver something solid. Ask it to write a React component with Tailwind styling, and you get clean output.
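For reference, here is roughly the kind of self-contained function that first prompt produces. This is a hedged sketch; the output schema is illustrative rather than AttackVector's actual normalized format.

```python
import json
import xml.etree.ElementTree as ET

def normalize_nmap_xml(xml_text: str) -> dict:
    """Convert raw `nmap -oX -` output into a simple, consistent dict."""
    root = ET.fromstring(xml_text)
    hosts = []
    for host in root.findall("host"):
        addr = host.find("address")
        ports = []
        for port in host.findall("./ports/port"):
            state = port.find("state")
            service = port.find("service")
            ports.append({
                "port": int(port.get("portid")),
                "protocol": port.get("protocol"),
                "state": state.get("state") if state is not None else "unknown",
                "service": service.get("name") if service is not None else None,
            })
        hosts.append({
            "address": addr.get("addr") if addr is not None else None,
            "ports": ports,
        })
    return {"tool": "nmap", "hosts": hosts}

# Usage: json.dumps(normalize_nmap_xml(raw_xml), indent=2)
```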

The problem is integration.

Multi-Service Blindness

AttackVector has four interconnected services that must stay in sync:

  1. Backend (FastAPI) exposes API endpoints and orchestrates scanning
  2. Frontend (Next.js) renders results and serves SEO-optimized content pages
  3. Admin Panel (Next.js) provides content management and generation controls
  4. Docker Layer manages disposable tool containers with scope enforcement

When we added the glossary feature, it required coordinated changes across all four services: new Pydantic models in the backend, new API endpoints, a new TypeScript data layer in the frontend, new React components with structured data for SEO, and new admin UI forms. These changes had to share type definitions, naming conventions, and URL patterns.

Codex generated each piece competently in isolation. But the pieces didn't fit together. The backend returned snake_case fields while the frontend TypeScript interfaces expected camelCase. The admin API client used different error handling patterns than the existing blog endpoints. The SEO structured data referenced fields that didn't exist in the generated TypeScript types.
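Mismatches like the snake_case/camelCase one have a well-known mitigation that a project-aware tool can apply consistently. As a hedged sketch (assuming Pydantic v2, and not necessarily what we shipped), a shared base model keeps snake_case in Python while serializing camelCase for the frontend:

```python
from pydantic import BaseModel, ConfigDict
from pydantic.alias_generators import to_camel

class ApiModel(BaseModel):
    """Base for API schemas: snake_case in Python, camelCase on the wire."""
    model_config = ConfigDict(alias_generator=to_camel, populate_by_name=True)

class GlossaryEntry(ApiModel):    # hypothetical model, for illustration only
    term_slug: str
    short_definition: str
    related_terms: list[str] = []

entry = GlossaryEntry(term_slug="sql-injection", short_definition="Injection of SQL via untrusted input.")
print(entry.model_dump(by_alias=True))
# {'termSlug': 'sql-injection', 'shortDefinition': '...', 'relatedTerms': []}
```

The point is not that the fix is hard; it is that someone, or something, has to remember to apply it on every new endpoint.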

Integration is where most AI-generated code falls apart, and Codex offered no mechanism to maintain cross-service coherence.

Domain-Specific Hallucinations

Security tooling has precise terminology and specific tool behaviors. Codex occasionally generated plausible-looking but incorrect nmap flag combinations, invented nuclei template syntax that doesn't exist, or produced Docker security configurations that would actually weaken container isolation. When you are building a security product, hallucinated security configurations are not just bugs — they are liabilities.

Claude Code: Single-File Mastery, Multi-File Struggle

Claude Code (Anthropic's CLI agent) produces exceptionally high-quality code within a single file. Its understanding of language semantics, error handling patterns, and edge cases is strong. For focused refactoring of individual modules, it was often the most precise tool we tested.

Where It Breaks Down

Claude Code operates through a terminal interface with manual context feeding. To make it aware of your project structure, you need to explicitly provide file contents, explain relationships, and maintain that context yourself across invocations.

For AttackVector, a typical feature addition touches 6-8 files across 3 services. Here is what adding the glossary SEO schema required:

  • frontend/src/components/seo/GlossarySchema.tsx — JSON-LD structured data
  • frontend/src/app/glossary/[slug]/page.tsx — dynamic route with metadata generation
  • frontend/src/app/glossary/page.tsx — overview page with A-Z navigation
  • frontend/src/app/sitemap.ts — dynamic sitemap inclusion
  • frontend/src/app/robots.ts — crawler rules for 16 bot user agents
  • frontend/src/components/common/Header.tsx — navigation link addition

Each file needed to import from the right paths, use consistent component patterns, and align with the existing blog system's conventions. Feeding all of this context into Claude Code manually for every change was unsustainable.

No Deployment Awareness

Claude Code doesn't know about your deployment pipeline. It can't reason about whether a change to sitemap.ts requires a rebuild, or whether modifying an environment variable needs a container restart. AttackVector's deployment involves building Docker images, pushing to a registry, and coordinating frontend and backend deployments. This operational context lives outside Claude Code's scope entirely.

StartupGPT.pro: The General Contractor

StartupGPT.pro is not a code generator. It is not an IDE plugin. It is an AI-powered development platform that maintains a persistent understanding of your entire project across sessions, services, and deployment layers.

Here is specifically what made it work for AttackVector:

Persistent Architecture Memory

When we told StartupGPT.pro on day one that our backend uses FastAPI with Pydantic models, our frontend uses Next.js 16 with the App Router, and our content follows a two-pass LLM generation pattern — it remembered. Not just for that session, but across every subsequent interaction.

Three weeks later, when we said "add a glossary feature like the blog system," it already knew:

  • The blog uses a content.ts static file with typed exports
  • The admin API follows a specific endpoint pattern (/admin/content/{type}/generate, sketched below)
  • Content generation uses a draft generator, editorial refiner, and self-checker pipeline
  • Frontend pages use generateStaticParams for SSG with ISR

It generated all the glossary files following these exact patterns. No re-explanation needed.
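On the backend, the endpoint pattern it reused looks roughly like this. The request and response models below are simplified stand-ins, and run_pipeline is a hypothetical placeholder for the generation flow described in the next section:

```python
from dataclasses import dataclass

from fastapi import APIRouter, HTTPException
from pydantic import BaseModel

router = APIRouter(prefix="/admin/content")
SUPPORTED_TYPES = {"blog", "glossary"}

class GenerateRequest(BaseModel):      # simplified; the real schemas carry more fields
    topic: str
    count: int = 1

class GenerateResponse(BaseModel):
    content_type: str
    generated: int
    flagged_for_review: int

@dataclass
class PipelineResult:
    slug: str
    score: int

async def run_pipeline(content_type: str, topic: str, count: int) -> list[PipelineResult]:
    """Hypothetical stand-in for the draft -> refine -> self-check flow."""
    return [PipelineResult(slug=f"{topic}-{i}", score=85) for i in range(count)]

@router.post("/{content_type}/generate", response_model=GenerateResponse)
async def generate_content(content_type: str, req: GenerateRequest) -> GenerateResponse:
    if content_type not in SUPPORTED_TYPES:
        raise HTTPException(status_code=404, detail=f"unknown content type: {content_type}")
    results = await run_pipeline(content_type, req.topic, req.count)
    flagged = sum(1 for r in results if r.score < 70)
    return GenerateResponse(content_type=content_type, generated=len(results), flagged_for_review=flagged)
```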

Cross-Service Coordination

Adding the glossary wasn't a single prompt. It was a coordinated plan that StartupGPT.pro executed across the full stack:

  1. Created TypeScript interfaces in frontend/src/lib/glossary/types.ts matching the existing blog type patterns
  2. Built backend prompt templates mirroring the blog's draft/refine pipeline
  3. Added API endpoints that followed the same Pydantic model structure as blog endpoints
  4. Generated frontend components using the same markdown renderer, icon library, and styling conventions
  5. Extended the admin panel with glossary generation forms matching the existing blog UI section
  6. Updated sitemap.ts and robots.ts to include glossary pages for search engines and AI crawlers

Every file was consistent. Every import path was correct. Every naming convention matched the existing codebase. That level of coordination across 15+ new files and 8 modified files is what separates a project-aware platform from a clever autocomplete.

The Content Pipeline: A Case Study

The content generation system is the best example of what StartupGPT.pro enabled. Here is the architecture:

First pass — Draft Generator: Custom system prompts instruct the LLM to produce human-like, SEO-optimized content. For glossary entries, this means an ELI5 section written as if explaining to a curious teenager, a technical section with definitions, comparisons, and real-world examples, plus metadata like keywords and related terms.

Second pass — Editorial Refiner: A separate LLM call reviews the draft for accessibility, technical accuracy, and tone. It strips AI-sounding phrases, ensures code examples are correct, and tightens prose.

Third pass — Self-Checker: A programmatic quality checker validates the output — word count thresholds, action-verb takeaways, no repeated sentence starters, no unsupported claims. Entries that score below 70 are flagged.
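The self-checker is the most mechanical stage, so it is the easiest to sketch. The thresholds and weights below are illustrative, not our exact scoring rules:

```python
import re

def quality_score(body: str, takeaways: list[str], min_words: int = 800) -> int:
    """Score a generated entry 0-100 with a few mechanical checks (illustrative weights)."""
    score = 100

    # Word-count threshold
    if len(body.split()) < min_words:
        score -= 30

    # Takeaways should open with something other than a filler word (rough action-verb proxy)
    weak_openers = {"the", "a", "an", "it", "this", "there"}
    if any(t.split()[0].lower() in weak_openers for t in takeaways if t.strip()):
        score -= 15

    # Penalize repeated sentence starters, a common LLM tell
    sentences = [s.strip() for s in re.split(r"[.!?]+\s+", body) if s.strip()]
    starters = [s.split()[0].lower() for s in sentences if s.split()]
    if starters and len(set(starters)) < 0.6 * len(starters):
        score -= 20

    return max(score, 0)

# Entries scoring below 70 are flagged for manual review instead of being persisted.
```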

Persistence: The refined content is formatted as TypeScript and appended to static content.ts files that Next.js compiles at build time for zero-runtime-cost serving.
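Persistence itself is essentially string templating into the static TypeScript file. A rough sketch follows; the file path and export naming are assumptions, and the real file likely maintains a typed array rather than appended exports:

```python
import json
from pathlib import Path

# Path and export naming are assumptions for this sketch.
CONTENT_FILE = Path("frontend/src/lib/glossary/content.ts")

def append_entry(entry: dict) -> None:
    """Append a refined entry to the static content.ts compiled by Next.js at build time."""
    ts_literal = json.dumps(entry, indent=2)              # valid TypeScript object literal
    export_name = "entry_" + entry["slug"].replace("-", "_")
    with CONTENT_FILE.open("a", encoding="utf-8") as f:
        f.write(f"\nexport const {export_name} = {ts_literal} as const;\n")
```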

StartupGPT.pro built this entire pipeline from our architectural description. When we later said "extend this to glossary entries," it replicated the exact same three-stage flow with glossary-specific prompts — because it remembered the blog pipeline architecture.

Docker Tool Orchestration

The scanning pipeline requires running security tools (nmap, nuclei, nikto) in disposable Docker containers. This involves:

  • A tool registry that allowlists specific binaries and their permitted flags
  • A scope enforcer that validates target URLs against the user's verified assets
  • An executor that spins up ephemeral containers, mounts no host volumes, enforces timeouts, and captures stdout
  • A normalizer that converts each tool's output format into a consistent JSON schema
  • An audit logger that records every execution for compliance

StartupGPT.pro understood that these components needed to be a cohesive package (tools/ module) rather than scattered utilities. It designed the module boundary, the internal interfaces, and the integration points with the scanning pipeline — all while maintaining awareness that this code runs inside containers that must never have access to the host filesystem.
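A compressed sketch of how those pieces fit together is shown below. Tool names, images, denied flags, and Docker options here are illustrative and nowhere near a complete hardening story:

```python
import subprocess

# Registry of allowlisted binaries, their container images, and flags that are never permitted.
TOOL_REGISTRY = {
    "nmap": {"image": "attackvector/nmap:latest", "denied_flags": {"--script", "-oA"}},
    "nuclei": {"image": "attackvector/nuclei:latest", "denied_flags": set()},  # populated per tool
}

def enforce_scope(target: str, verified_assets: set[str]) -> None:
    if target not in verified_assets:
        raise PermissionError(f"target {target!r} is outside the user's verified scope")

def run_tool(tool: str, args: list[str], target: str, verified_assets: set[str], timeout: int = 300) -> str:
    spec = TOOL_REGISTRY.get(tool)
    if spec is None:
        raise ValueError(f"tool {tool!r} is not allowlisted")
    if any(flag in spec["denied_flags"] for flag in args):
        raise ValueError("denied flag in arguments")
    enforce_scope(target, verified_assets)

    # Ephemeral container: removed on exit, no host volumes mounted, hard timeout.
    cmd = ["docker", "run", "--rm", "--network", "scan-net", spec["image"], tool, *args, target]
    proc = subprocess.run(cmd, capture_output=True, text=True, timeout=timeout)
    return proc.stdout  # the normalizer and audit logger consume this downstream
```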

Honest Tradeoffs

This is not a sales pitch. Here is what we had to work around:

Learning curve. StartupGPT.pro's project memory system requires you to be intentional about what you communicate. Garbage context produces garbage results. We spent the first week establishing conventions and architectural patterns clearly. That investment paid off massively, but it was real upfront work.

Not a replacement for expertise. StartupGPT.pro accelerated our development dramatically, but it didn't replace our security knowledge. When the AI generated Docker configurations, our team still reviewed every --cap-drop flag, every network isolation rule, every timeout value. AI-assisted development means the human reviews faster and catches more — not that the human checks out.

Occasional over-confidence. Like any LLM-powered system, StartupGPT.pro sometimes generated code that compiled and looked correct but contained subtle logical issues. Our two-pass content pipeline exists partly because single-pass LLM output wasn't reliable enough for SEO content quality. The same principle applies to code: always validate, especially in security contexts.

The Numbers

Here is what the AttackVector build looked like with StartupGPT.pro:

  • Total files in codebase: 80+
  • Services coordinated: 4 (backend, frontend, admin, Docker layer)
  • Glossary entries generated: 52
  • Blog articles generated: 2
  • Content pipeline stages: 3 (draft, refine, self-check)
  • SEO bot rules configured: 16 user agents
  • Time from concept to deployed glossary: 1 session
  • Time from concept to scanning pipeline architecture: 2 sessions

Compare this to our initial estimate of 3-4 weeks for the glossary feature alone using traditional development. We shipped it in a day, with SEO, structured data, admin UI, and 52 entries.

When to Use What

We don't believe in one-tool-fits-all. Here is our honest recommendation:

  • Base44 — Perfect for landing pages, simple SaaS MVPs, and internal tools where the backend is a database with CRUD operations. Don't try to build security tooling with it.
  • Cursor — Excellent as your daily IDE. We still use it. But pair it with a project-aware orchestrator for complex, multi-service work.
  • Codex — Strong for generating isolated utility functions, algorithms, and data transformations. Use it as a component within a larger workflow, not as the workflow itself.
  • Claude Code — Best for focused refactoring, code review, and single-module improvements where you can provide full context manually. Ideal for polishing, not building from scratch.
  • StartupGPT.pro — Built for the kind of project where everything connects to everything else. Where adding a feature means touching 10 files across 3 services and they all need to stay consistent. Where the AI needs to remember decisions from last week to make good decisions today.

The Bottom Line

We built AttackVector with StartupGPT.pro because building production-grade security software is a coordination problem, not just a code generation problem. Every tool we evaluated could generate good code in isolation. Only StartupGPT.pro could maintain the architectural coherence across a multi-service, security-critical platform over weeks of iterative development.

The result is a platform that runs a 4-stage scanning pipeline, generates SEO-optimized cybersecurity content through a three-stage LLM pipeline, manages Docker-based tool execution with hard security boundaries, and serves it all through a modern Next.js frontend with structured data for search engines and AI crawlers.

We didn't choose StartupGPT.pro because it writes better code than the alternatives. We chose it because it remembers why we write the code the way we do.

Key Takeaways

  1. Choose your AI development platform based on project complexity — simple apps don't need heavy orchestration, but multi-service platforms demand persistent context.
  2. Validate all AI-generated security configurations manually — acceleration is not a substitute for expertise.
  3. Invest time upfront in communicating your architecture clearly to AI systems — the quality of output directly reflects the quality of input.
  4. Design content generation as a multi-pass pipeline — single-pass LLM output rarely meets production quality standards.
  5. Treat AI-assisted development as a force multiplier for skilled teams, not a replacement for domain knowledge.