Joseph Anthony Pasquale Holsten

Burlington, WA 98233 US
48°29′N 122°19′W

Technical Expertise

Infrastructure as Code & Automation
Terraform, Ansible, Puppet, Chef
Cloud Platforms & Compute
Oracle Cloud Infrastructure, Azure, AWS, GCP, Kubernetes, Docker
Observability & Reliability
Prometheus, Grafana, Sensu, OpenTelemetry, New Relic, PagerDuty, distributed tracing, incident response, SLO/SRE practices
AI/ML Infrastructure
LLM model serving, SLM fine-tuning (Axolotl/SFT), PydanticAI, RAG architecture, model operations, inference pipelines
Distributed Systems & Data
Kafka, RabbitMQ, PostgreSQL, Redis, S3/object storage, ELK stack, data pipelines
Programming & Scripting
Go, Python, Ruby, Shell scripting, Infrastructure SDKs

Experience

Fractional Director of Engineering, Zoning Intelligence Platform (Stealth) 2026-01 – Present
- Designing and building an end-to-end legal document ingestion pipeline that normalizes municipal and regional ordinances from 22+ jurisdictions into Akoma Ntoso (legal XML), constructs a knowledge graph via NLP, and exposes structured graph context to a fine-tuned SLM at inference time, replacing flat embedding-based retrieval with document-hierarchy-preserving intelligence.
- Designed proprietary hierarchical tree indexing for legal code navigation (purpose-built for ordinance structure, analogous to PageIndex), enabling LLM-based tree traversal that preserves title/chapter/section hierarchy rather than collapsing to embedding chunks.
- Implementing QLoRA fine-tuning on a self-hosted Qwen 3.5 SLM (OCI, NVIDIA GPUs), using distillation as a bridge while land use attorneys and paralegals develop domain training data; collecting user feedback for future RLHF alignment.
- Established a deliberate compliance architecture that excludes customer documents from training and inference entirely; this intentional market segment decision eliminates data liability exposure from the ML stack and keeps the platform outside CJIS and PII-heavy use cases.
- Driving benchmark-driven technology selection across the ML stack: evaluating NLP parsers (Stanza, CoreNLP, AllenAI SRL), graph databases (Memgraph, Kuzu, Neo4j, FalkorDB, Apache AGE), and base models (Qwen, OLMo, Phi, Gemma) against the legal domain rather than general leaderboard rankings.
Fractional Director of Engineering, Guardrail Technologies 2025-10 – 2026-01
- Fixed a fundamental security flaw in the PII redaction pipeline: the system was sending the values it was contracted to protect to a commercial LLM vendor for pattern detection; replaced LLM-based detection with per-customer regex rules stored in MongoDB, eliminating the exfiltration entirely.
- Stabilized a vibe-coded AI safety prototype to deliver the first paying customer by contract deadline: rebuilt the application delivery pipeline to enable reproducible environment promotion, then shipped the three contractually required features (conversation history fix, Claude LLM backend, configurable data redaction policies).
- Implemented synthetic service availability monitoring and isolated staging/production DNS as CAPA after a staging deployment took down production; no further production outages occurred through end of engagement.
- Extended the platform's LLM backend to support OpenAI (GPT-4o, GPT-5), Anthropic Claude, and self-hosted Ollama models (Llama, Mistral, DeepSeek), enabling provider selection by compliance requirement; led VRAM and inference-speed analysis to migrate self-hosted inference from third-party GPU hosting to a right-sized Azure NC24ads A100 v4, cutting costs and eliminating the markup.
Senior Technical Program Manager, Oracle Cloud Infrastructure, Compute 2021-10 – 2024-11
- Spearheaded development of an operational response process for a 35-member service team to address incident recurrence and improve time-to-mitigation.
- Designed and scaled a major incident response process for a 500+ member unit, introducing documentation standards and automated repair frameworks, enhancing collaboration and efficiency.
- Led operability team in creating feedback cycles and automated repair systems for distributed compute systems, enabling scalable incident response for backend systems impacting thousands of instances.
- Contributed to foundational infrastructure for Kubernetes-based workloads on Oracle Cloud, benefiting high-profile customers like TikTok and xAI, enhancing service reliability and scalability.
Senior Engineer, Oracle Cloud Infrastructure, Compute 2019-09 – 2021-10
- Eliminated instance downtime by redesigning core API services to support in-place updates, scaling from dozens of beta users requiring manual intervention to thousands of automated daily updates across production infrastructure.
- Managed Gartner Magic Quadrant evaluation process for Compute service, navigating cross-organizational dependencies (Linux, Storage, Networking, Images) to deliver performance benchmarks under analyst scrutiny, validating that the infrastructure could handle 1,000 simultaneous launches.
- Prevented customer-impacting release delays during organizational upheaval by owning end-to-end review and coordination of Terraform provider, SDK, and CLI changes, keeping deliverables on track while peer teams experienced significant schedule slippage.
- Reduced customer support burden through deep performance optimization of foundational Go SDK, enabling self-service troubleshooting from application layer down to OS-level configuration for services like Oracle Kubernetes Engine.
Senior Solution Architect, Oracle Cloud Infrastructure, Compute 2018-05 – 2019-09
- Accelerated enterprise customer adoption by developing reusable infrastructure patterns (Active Directory integration, Elasticsearch deployment, custom Linux migrations) using Python, Terraform, Ansible, and Cloud-Init, reducing customer onboarding time and support escalations.
- Unblocked Oracle Global Business Unit migration to OCI by creating safe instance modification tooling that enabled rapid prototyping and removed deployment bottlenecks blocking Oracle Fusion Apps team adoption.
- Delivered VMware-on-OCI Terraform solution under a high-stakes deadline for the Oracle OpenWorld 2019 on-stage demonstration and Oracle-VMware partnership announcement, meeting critical go-to-market timeline.
- Established partner image security standards and review process, personally reviewing dozens of images and preventing multiple critical vulnerabilities (hardcoded credentials, SSH keys, unsafe permissions), then built and trained dedicated team to maintain quality enforcement.
Site Reliability Engineer, Private Internet Access 2017-08 – 2018-03
- Eliminated customer-reported outages by implementing Prometheus/Grafana monitoring for 3,000+ servers across 45 datacenters, replacing reactive support-ticket-based detection with proactive automated failure detection.
- Established foundational source control and CI/CD by deploying GitLab and GitLab CI, eliminating single-server code storage risks and enabling versioned deployments where none existed previously.
- Developed standardized cross-datacenter provisioning pipelines and decommissioning procedures, ensuring secure credential management and consistent deployment practices across distributed infrastructure.
- Designed incident response procedures to reduce MTTR and coordinate remediation across geographically distributed VPN infrastructure serving privacy-critical customer workloads.
Platform Engineer (Contract), Oracle Cloud Infrastructure 2017-03 – 2017-07
- Delivered foundational Go SDK and Terraform provider for Oracle Cloud Infrastructure's competitive market launch (November 2017), enabling configuration-as-code capabilities essential for enterprise customer acquisition against AWS.
- Built production-ready infrastructure tooling from minimal documentation during private beta, coordinating across Oracle service teams to define API contracts, ensure idempotent operations, and establish sustainable development practices for future team scaling.
- Designed core resource support (networking, compute, load balancing) with HashiCorp integration standards, then trained newly formed Oracle team on maintenance workflows, establishing foundation now used by OpenAI, NVIDIA DGX Cloud, and TikTok for AI/ML workloads.
Infrastructure Engineer, Ensighten 2014-11 – 2017-03
- Automated hybrid cloud infrastructure managing 1,300 servers across 14 datacenters and 3 cloud providers using Terraform, Puppet, and Ansible, enabling multi-cloud deployment flexibility and eliminating vendor lock-in for CDN and customer data platform services.
- Designed cross-region monitoring and autoscaling using Sensu/Graphite to ensure high availability across geographically distributed infrastructure supporting customer-facing CDN services.
- Led 90-day DNS migration converting 2,000 manually managed records to infrastructure-as-code, directing two-engineer team to build open-source Terraform providers for UltraDNS and NS1, preventing production errors through automated testing and detecting unauthorized configuration changes.
- Optimized Kafka infrastructure for Customer Data Platform by contributing custom RoundRobinAssignor to open-source project, enabling dynamic cluster scaling during maintenance without data rejection, reducing infrastructure costs through right-sized capacity.
Infrastructure Engineer, Simply Measured 2012-02 – 2014-08
- Engineered and maintained scalable infrastructure for 500 servers, achieving 150:1 server-to-admin ratio and ensuring high availability.
- Developed monitoring tools and metrics services, reducing downtime and enhancing operational efficiency. Streamlined 100+ SOPs, led training for 45 engineers, and optimized on-call processes, improving response times.
- Implemented distributed processing using Resque/Ruby, reducing data processing time. Collaborated with cross-functional teams to integrate infrastructure solutions, improving deployment success rates.
Further experience available upon request

Open Source & Community Contributions

Most of my open source contributions are visible at github.com/josephholsten;
contributions outside GitHub follow:

Founder, Board Member, Sous Chefs 2015 – 2019 [github-org, torch-passing]
Contributor, terraform 2014 – Present [OCI, NS1, UltraDNS, core]
Contributor, glibc 2014 [commitdiff]
Contributor, go 2013 [commitdiff]
Contributor, OASIS 2008 – 2010 [XRD]
Contributor, IETF 2009 [Disco, About:]
Contributor, OAuth Extensions 2008 [Session, Language, Scalable]
Contributor, Ruby on Rails 2008 [XmlMini]

Presentations

Sous Chefs - Fostering Better Community Cookbooks, ChefConf [video] 2017-05-24
Sous Chefs, Food Fight Show [video] 2017-01-06
Saturday Night Disco on Monday Morning, DevOpsDays Austin [slides] 2014-05-06
Becoming Fearlessly Definite & Resourceful, ChefConf [video, slides] 2014-04-17
Services in Go: from Proof-of-Concept to Production, GolangVan [video, slides] 2014-02-05

Education

BS Computer Science, Oklahoma State University, Tulsa, OK 2005 – 2010
