
Full-stack AI developer to set up a secure local GPT-like assistant with doc analysis and memory.

Remote, USA | Full-time | Posted 2025-07-27

Seeking a highly competent AI engineer to build a fully local, modular AI assistant designed for legal analysis, document synthesis, clinical planning, and research-intensive writing. This is a mission-critical cognitive environment, not a chatbot or LLM toy.

The system must run entirely offline, preserve compartmentalized memory, and support vector-based document ingestion — all deployed on a MacBook Pro M4 Max (64GB RAM, 2TB SSD). It must be auditable, inspectable, and fully under user control at every layer.

Build Scope (Three Phases – One Builder Only)

You will be contracted to deliver the entire architecture across three phases:

    Phase 1 — ("Legal & Tax Office")
  • LM Studio / llama.cpp optimized for Apple Silicon
  • Vector DB setup (Chroma/Qdrant/Weaviate) with local document parsing
  • UI (Streamlit, Electron, or lightweight shell)
  • PDF/DOCX ingestion with embedded citation + source trace (see the ingestion sketch after this list)
  • Memory persistence: prompt saving, tone tagging, output control
  • Output styles: affidavit, brief, memo, etc.
  • No cloud sync. No logging. Fully local.
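
For orientation only, here is a minimal sketch of the Phase 1 ingestion path, assuming PyMuPDF and Chroma from the preferred stack below; the file names, chunking strategy, and embedding choices are illustrative and ultimately yours to propose.

    # Rough sketch only: local PDF ingestion into an on-disk Chroma collection
    # with per-chunk source metadata, so output can be traced back to file + page.
    # Assumes PyMuPDF (fitz) and chromadb; everything stays on the local disk.
    import fitz                      # PyMuPDF
    import chromadb

    client = chromadb.PersistentClient(path="./vector_store")   # local, on-disk only
    collection = client.get_or_create_collection("legal_office")

    def ingest_pdf(path: str) -> None:
        doc = fitz.open(path)
        for page_num, page in enumerate(doc, start=1):
            text = page.get_text().strip()
            if not text:
                continue
            collection.add(
                ids=[f"{path}:p{page_num}"],
                documents=[text],
                # Source trace stored with every chunk so citations can be rebuilt.
                metadatas=[{"source": path, "page": page_num}],
            )

    ingest_pdf("sample_brief.pdf")   # hypothetical file name

The real build would also handle DOCX, chunk below page granularity, and surface the same source metadata in the UI.
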
    Phase 2 — ("The Board Room") Multi-Domain Strategic Workspace
  • Add support for clinical workflows, estate planning, strategic writing
  • Improved memory containers (per domain)
  • API access wrapper to GPT-4 or Claude 3 (via user-owned keys ONLY; see the gateway sketch after this list)
  • Enhanced persona switching, tone locks, exportable sessions
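
To make the "user-owned keys ONLY" requirement concrete, here is a hedged sketch of the expected gateway behavior, assuming the official openai Python client (the Claude path would mirror it); the environment-variable names and opt-in flag are illustrative.

    # Rough sketch: remote calls are refused unless the user has both supplied
    # their own key and explicitly enabled cloud access. Nothing leaves the machine by default.
    import os
    from openai import OpenAI

    CLOUD_ENABLED = os.environ.get("ASSISTANT_ALLOW_CLOUD") == "1"   # explicit opt-in (illustrative flag)

    def remote_completion(prompt: str, model: str = "gpt-4") -> str:
        if not CLOUD_ENABLED:
            raise RuntimeError("Cloud access is disabled; declare and authorize it first.")
        api_key = os.environ.get("OPENAI_API_KEY")                   # user-owned key only
        if not api_key:
            raise RuntimeError("No user-owned API key configured.")
        client = OpenAI(api_key=api_key)
        response = client.chat.completions.create(
            model=model,
            messages=[{"role": "user", "content": prompt}],
        )
        return response.choices[0].message.content

Every call through this path must be explicitly declared and authorized, per the audit clauses below.
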
    Phase 3 — ("The Den") Private Cognitive Module
  • Separate vault for high-context reasoning and longform writing
  • Firewalled from the Office; no cross-contamination (see the compartmentalization sketch after this list)
  • Persistent memory, stylistic voice preservation, session archiving
  • Local-only inference with optional narrative assistive tools
  • Optional future integration: multimodal parsing, encryption layers
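
As a rough illustration of the memory compartmentalization expected here, a plain JSON store with one isolated directory per domain would be acceptable; the names and layout below are illustrative only, and the Den would live on its own vault path.

    # Rough sketch: each domain gets its own on-disk store, and a container can
    # only read or write its own path, so the Den never shares state with the Office.
    import json
    from pathlib import Path

    class MemoryContainer:
        def __init__(self, domain: str, root: Path = Path("./memory")):
            self.path = root / domain / "memory.json"
            self.path.parent.mkdir(parents=True, exist_ok=True)

        def load(self) -> dict:
            return json.loads(self.path.read_text()) if self.path.exists() else {}

        def save(self, key: str, value: str) -> None:
            data = self.load()
            data[key] = value
            self.path.write_text(json.dumps(data, indent=2))

    office = MemoryContainer("office")
    den = MemoryContainer("den")     # firewalled: nothing routes between the two stores
    office.save("tone", "formal-legal")
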
    Security & Audit Clauses (Non-Negotiable)
  • You must consent to third-party code audit after each phase
  • No proprietary wrappers
  • No API calls unless explicitly declared and authorized
  • All configuration paths, prompts, vector stores, and memory logs must be accessible to the user
  • You will provide:
      • Full install scripts
      • A Summary.txt documenting build logic, parameters, and dependencies
      • Notes to replicate the install from a clean OS

Payment Structure

This is an hourly project with a firm "not-to-exceed" ceiling, payable at the end of each phase upon satisfactory completion and code audit.

    Terms:
  • You (the builder) are committing to the entire 3-phase build, unless mutually terminated.
  • I (the client) commit to paying for each completed phase promptly and fully once:
      • Deliverables are met
      • The audit review is cleared
  • I reserve the right to end the project after any phase, for any reason, without obligation to proceed further.
  • You agree that no partial or hidden ownership exists — all build components, logic, and configuration must be shared in full and are under the client’s control.

Summary:

Phase | Deliverable | Payment | Exit Clause
Phase 1 | Inference + UI | Paid after audit | Client may exit
Phase 2 | Vector Store + API logic | Paid after audit | Client may exit
Phase 3 | Final Memory Shell + Vault Split | Paid after audit | Final

Please submit your hourly rate and your not-to-exceed cap (USD) for the full build.

System Context

You’ll be working on a clean Apple M4 Max MacBook Pro (64GB RAM, 2TB SSD), macOS Sonoma.

    Pre-installed:
  • LM Studio
  • Homebrew
  • Git
    Available access:
  • AnyDesk (if needed)
  • Local data files (legal PDFs, notes, DOCXs, etc.)

Preferred Stack (flexible with reasoned alternatives)

Layer | Preferred Tool
Inference | LM Studio / llama.cpp
Vector DB | Chroma or Qdrant
File Parsing | PyMuPDF, unstructured.io
UI | Streamlit or Electron
Memory Logic | JSON store or LangGraph
API Gateway (Phase 2–3) | GPT-4 / Claude via secure keys

    Models to install/test:
  • MythoMax 13B Q5
  • Chronos Hermes 13B
  • Mistral 7B Instruct
(Final quantization/tuning TBD based on your advice; see the inference sketch below)
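
For context, a minimal local-inference sketch assuming the llama-cpp-python binding for llama.cpp (LM Studio's local server would be an equally acceptable route); the model path and parameters are illustrative.

    # Rough sketch: fully local inference on Apple Silicon with Metal offload.
    # Assumes the llama-cpp-python binding; quantization and context size TBD.
    from llama_cpp import Llama

    llm = Llama(
        model_path="./models/mistral-7b-instruct.Q5_K_M.gguf",  # hypothetical local file
        n_gpu_layers=-1,     # offload all layers to the GPU (Metal) on the M4 Max
        n_ctx=8192,          # context window; tune per model
    )

    result = llm.create_chat_completion(
        messages=[{"role": "user", "content": "Summarize this memo in three bullet points."}]
    )
    print(result["choices"][0]["message"]["content"])
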
    ✅ You Should Apply If You:
  • Have built secure local GPT-style systems before
  • Know vector stores, LLM inference, and UI wiring well
  • Understand data compartmentalization
  • Are comfortable being audited
  • Can document your work cleanly
  • Are discreet and professional in high-trust builds
    Bonus if you’ve worked with:
  • Apple Silicon
  • Clinical/legal data environments
  • High-context cognitive tools (not just chat interfaces)

How to Apply

In your message, please include:

1. A short list of similar builds (esp. local/private GPT)

2. Your preferred stack and why

3. Your hourly rate

4. Your not-to-exceed ceiling for all three phases

5. Confirmation you accept the audit clause and early termination terms
