> smyan.boot()

Building agents that steer themselves through messy workflows.

I work on agent harnesses, eval suites, and non-deterministic orchestration — systems where ML models need guardrails, not scripts.

system.status = online
employer = "Alix"
role = "AI Engineer"
focus = ["agents", "evals", "self-guiding agents", "harnesses", "non-deterministic workflows", "ml systems"]
location = "San Francisco"
mode = "building"

> status.check

AI Engineer @ Alix / Agent Evals / Self-Guiding Agents / Agent Harnesses / Non-Deterministic Workflows / ML Systems

> work.log

[01]

Agent Evaluation Harness

Python / Evals / Agents / ML / CI/CD

Built an eval framework at Alix for scoring agent trajectories — tool-use correctness, task completion, regression gates on golden runs, and drift detection across model and prompt changes.

Catches behavioral regressions before deploy with automated trajectory scoring and pass/fail gates.
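To make the idea of a golden-run gate concrete, here is a minimal sketch. It is not the Alix framework: `Step`, `tool_use_score`, `gate`, and the 0.8 threshold are hypothetical names and values chosen for illustration. The score is the fraction of golden tool calls that the candidate run reproduces, in order, with a successful result.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    """One tool call in an agent trajectory (hypothetical shape)."""
    tool: str
    ok: bool = True  # whether the tool call succeeded


def tool_use_score(golden: list[Step], candidate: list[Step]) -> float:
    """Fraction of golden steps matched, in order, by successful candidate steps.

    Extra candidate steps are tolerated; a golden step that never appears
    blocks every golden step after it.
    """
    if not golden:
        return 1.0
    matched = 0
    for step in candidate:
        if matched < len(golden) and step.ok and step.tool == golden[matched].tool:
            matched += 1
    return matched / len(golden)


def gate(golden: list[Step], candidate: list[Step], threshold: float = 0.8) -> bool:
    """Pass/fail decision for CI: True means the run may ship."""
    return tool_use_score(golden, candidate) >= threshold


golden_run = [Step("search"), Step("fetch"), Step("summarize")]
noisy_but_fine = [Step("search"), Step("log"), Step("fetch"), Step("summarize")]
skipped_fetch = [Step("search"), Step("summarize")]
```

In this sketch `gate(golden_run, noisy_but_fine)` passes because the extra `log` step is ignored, while `gate(golden_run, skipped_fetch)` fails because the run never fetched.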

[02]

Self-Guiding Agent Runtime

Agents / LLM / Python / ML / Reflection

Designed self-guiding agents that plan, execute, reflect, and re-route without hardcoded step sequences. Agents observe intermediate state, critique their own output, and adjust course mid-run.

Handles open-ended tasks where the path isn't known upfront — no fixed DAG required.
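The plan, execute, reflect, re-route loop can be sketched in a few lines. This is a toy under stated assumptions, not the production runtime: `run_self_guiding` and the `plan`, `execute`, and `critique` callables are hypothetical, and a counter stands in for a real model and real tools. The point is that the next action depends on the critique of the last one, not on a fixed sequence.

```python
from typing import Any, Callable, Optional

History = list[tuple[Any, Any, str]]  # (action, result, feedback)


def run_self_guiding(
    goal: Any,
    plan: Callable[[Any, History], Optional[Any]],
    execute: Callable[[Any], Any],
    critique: Callable[[Any, Any, Any], str],
    max_steps: int = 10,
) -> History:
    """Loop until the planner returns None or the step budget runs out."""
    history: History = []
    for _ in range(max_steps):
        action = plan(goal, history)  # planner sees all prior feedback
        if action is None:
            break
        result = execute(action)
        history.append((action, result, critique(goal, action, result)))
    return history


def make_counter() -> Callable[[int], int]:
    """Toy environment: each action adds to a running total."""
    total = {"value": 0}

    def execute(step: int) -> int:
        total["value"] += step
        return total["value"]

    return execute


def plan(goal: int, history: History) -> Optional[int]:
    if not history:
        return 4
    feedback = history[-1][2]
    if feedback == "done":
        return None
    return 4 if feedback == "under" else -1  # re-route after an overshoot


def critique(goal: int, action: int, result: int) -> str:
    if result == goal:
        return "done"
    return "under" if result < goal else "over"


trace = run_self_guiding(10, plan, make_counter(), critique)
results = [result for _, result, _ in trace]  # [4, 8, 12, 11, 10]
```

The toy agent overshoots at 12, notices via the critique, and backs off one step at a time until it lands on the goal.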

[03]

Agent Harness & Tool Orchestration

Harnesses / Agents / Python / TypeScript / MCP

Built a harness layer that wraps agents with structured tool access, retry policies, timeout budgets, and observability hooks. Standardizes how agents call APIs, query data, and hand off between sub-agents.

Single interface for spinning up, monitoring, and tearing down agent sessions across workflows.
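As an illustration of retry policies, timeout budgets, and observability hooks working together, here is a small sketch. The names (`call_with_budget`, `BudgetExceeded`, `on_event`) are hypothetical and not taken from the actual harness. One simplifying assumption: the budget is only checked between attempts, so it does not interrupt a tool call that is already running.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


class BudgetExceeded(Exception):
    """Raised when the time budget runs out before another attempt can start."""


def call_with_budget(
    tool: Callable[[], T],
    *,
    max_attempts: int = 3,
    budget_s: float = 5.0,
    backoff_s: float = 0.1,
    on_event: Optional[Callable[[str, int], None]] = None,
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call `tool` with exponential backoff, bounded by attempts and wall time."""
    deadline = clock() + budget_s
    last_error: Optional[Exception] = None

    def emit(kind: str, attempt: int) -> None:
        if on_event is not None:
            on_event(kind, attempt)  # observability hook: metrics, logs, traces

    for attempt in range(1, max_attempts + 1):
        if clock() >= deadline:
            raise BudgetExceeded(f"budget spent before attempt {attempt}") from last_error
        try:
            result = tool()
        except Exception as exc:  # a real harness would retry only known-transient errors
            last_error = exc
            emit("error", attempt)
            if attempt < max_attempts:
                remaining = max(0.0, deadline - clock())
                sleep(min(backoff_s * 2 ** (attempt - 1), remaining))
        else:
            emit("ok", attempt)
            return result
    assert last_error is not None
    raise last_error
```

Injecting `clock` and `sleep` keeps the wrapper testable without real waiting, which matters once the same policy wraps every tool an agent can call.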

[04]

Non-Deterministic Workflow Engine

Workflows / Agents / Python / Orchestration / ML

Orchestrates workflows where each run can branch differently — stochastic agent decisions, parallel exploration paths, replay for debugging, and checkpointing for long-running jobs.

Supports branching, replay, and partial reruns without restarting the entire workflow.
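Checkpointing and partial reruns can be sketched as follows. This is a simplified, linear illustration rather than the engine itself: `run_workflow`, the JSON checkpoint layout, and the `(name, fn)` step format are assumptions made for the example, and branching is left out. Each step takes and returns a JSON-serializable dict, and the checkpoint is rewritten after every completed step.

```python
import json
from pathlib import Path
from typing import Callable

StepFn = Callable[[dict], dict]


def run_workflow(steps: list[tuple[str, StepFn]], checkpoint_path: str) -> dict:
    """Run steps in order, skipping any already recorded in the checkpoint."""
    path = Path(checkpoint_path)
    if path.exists():
        saved = json.loads(path.read_text())
    else:
        saved = {"done": [], "state": {}}

    for name, fn in steps:
        if name in saved["done"]:
            continue  # finished in an earlier run, so reuse its saved state
        saved["state"] = fn(saved["state"])
        saved["done"].append(name)
        path.write_text(json.dumps(saved))  # persist progress after each step
    return saved["state"]
```

If a step raises, the checkpoint still holds everything completed before it, so calling `run_workflow` again with the same path resumes at the failed step instead of starting over.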

[05]

Multi-Agent Financial Reporting Pipeline

Python / LangChain / Agents / ML / Evals

Designed and deployed a multi-agent financial reporting pipeline at EY using Python, LangChain, and Azure OpenAI — orchestrator, analyst, and reviewer roles with eval checks on output quality.

Cut analyst drafting time by approximately 40% across 3 engagement teams.

> capabilities.scan

Agent evaluation

Trajectory scoring, golden-run regression, behavioral drift detection, and CI gates for agent releases.

Self-guiding agents

Agents that plan, reflect, and re-route — built for tasks without fixed step sequences.

Agent harnesses

Tool orchestration, retry policies, session lifecycle, and observability for production agent runs.

Non-deterministic workflows

Branching orchestration, replay, checkpointing, and partial reruns for stochastic agent pipelines.

ML systems

Model integration, anomaly detection, pipeline monitoring, and production ML infrastructure.

Data engineering

Batch and streaming pipelines across Airflow, Kafka, Spark, Snowflake, and Azure.

> about.read

I'm a San Francisco-based engineer at Alix, building agent harnesses, evaluation frameworks, and orchestration for non-deterministic workflows. The hard part isn't calling an LLM — it's making agents reliable when every run looks different. Before Alix, I built multi-agent systems and ML pipelines at EY and MAI Capital Management — financial reporting agents, anomaly detection, and production data infrastructure across Airflow, Snowflake, Kafka, and Spark. I like simple interfaces, strong backends, and systems that hold up when the model does something unexpected.

> contact.open()

Building agents, evals, or workflows that need to survive production?