An automation app earns trust by what it refuses to do.
Watchflows is automation with real teeth: it watches folders, moves and deletes files, runs scripts, and calls AI models. You should not extend that kind of access on vibes. So instead of asking for trust, we'd rather show our work: what stays on your machine, what touches the network, and what stands between a misconfigured workflow and a bad afternoon. Where there are limits, they're listed too, under "What we won't claim" at the bottom.
Every workflow and execution record lives in a SQLite database inside the app's Application Support folder on your Mac; credentials live in the macOS Keychain and UI preferences in macOS defaults. There is no Watchflows server account and no cloud sync of your data. Your license is verified the same way, entirely on your machine, using Curve25519 EdDSA (Ed25519) public-key cryptography. Activation never contacts a server, works offline, and requires no account. The 14-day trial is enforced locally too, with no server check-in.
A license key is a WF- prefixed base64url JSON payload of {email, ts, v, sig}. Verification reconstructs the exact signed byte string and checks the Ed25519 signature with Apple's CryptoKit. Timestamps outside a sane window are rejected with the same generic error as a bad signature, so the timestamp check can't be used as an oracle. There is no debug or release bypass of verification: any decode failure is a rejection, and a dedicated bypass-test suite pins that behavior.
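To make the "one generic error" idea concrete, here is a minimal Python sketch, not the app's Swift code. The layout of the signed bytes, the timestamp bounds, the `verify_signature` callable (a stand-in for a real Ed25519 check), and the `encode_for_demo` helper are all assumptions made for illustration.

```python
import base64
import json
import time

GENERIC = "invalid license key"
# Assumed bounds for the "sane window"; the real limits are not given in the text.
EARLIEST_TS = 1_600_000_000
MAX_FUTURE_SKEW = 24 * 60 * 60


class LicenseError(Exception):
    """Raised for every rejection; the message is deliberately uniform."""


def parse_license(key, verify_signature, now=None):
    """Return the decoded payload, or raise LicenseError with one generic message."""
    now = time.time() if now is None else now
    try:
        if not key.startswith("WF-"):
            raise ValueError("missing prefix")
        body = key[len("WF-"):]
        padded = body + "=" * (-len(body) % 4)  # base64url text often drops padding
        payload = json.loads(base64.urlsafe_b64decode(padded))
        email, ts, version, sig = (payload[k] for k in ("email", "ts", "v", "sig"))
        if not (EARLIEST_TS <= ts <= now + MAX_FUTURE_SKEW):
            raise ValueError("timestamp outside window")
        # Hypothetical layout of the signed bytes; the real format is not documented here.
        message = f"{email}|{ts}|{version}".encode("utf-8")
        if not verify_signature(message, sig):
            raise ValueError("bad signature")
        return payload
    except Exception:
        # Decode failure, bad timestamp, bad signature: the caller sees the same error.
        raise LicenseError(GENERIC) from None


def encode_for_demo(payload):
    """Build a WF- key from a dict, for trying the parser out (no real signing)."""
    raw = json.dumps(payload).encode("utf-8")
    return "WF-" + base64.urlsafe_b64encode(raw).decode("ascii").rstrip("=")
```

Because every failure collapses into the same exception and message, a caller probing with altered timestamps learns nothing that a bad signature would not also tell them.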
One deliberate quirk worth explaining: the activated license is cached as a file in Application Support, not the Keychain. The default keychain ACL for non-sandboxed Developer ID apps ties items to the per-binary code hash, so a Keychain-stored license silently deactivated users on every update. The storage location was never the security gate anyway; the Ed25519 signature check is. (AI provider API keys, by contrast, are in the Keychain; the two are distinct on purpose.)
The app initiates network connections in exactly three situations: the Sparkle update check against the appcast at watchflows.app, calls to whichever AI provider you configure (which can be fully local via Ollama or LM Studio), and workflow nodes you explicitly build that make requests, like API Request or the AI Agent's web tools. Nothing else phones home. The app binary contains no analytics, telemetry, or crash-reporting SDK; its only third-party dependency is Sparkle, the open-source updater.
That three-situation claim comes from an exhaustive grep, not a feeling. Every URLSession call site in the codebase falls into one of: the update availability checker, the AI provider services (Ollama, LM Studio, OpenAI-compatible, Anthropic, Google, plus a reachability probe), the AI Agent's web fetch and search tools, and the API Request executor. The daily update check fetches only the appcast XML; the feed URL and the EdDSA public key are baked into Info.plist at build time.
Two scoping notes. First, the update check is on by default and hits watchflows.app daily. "No telemetry" does not mean "no connections." Second, the user-configured API Request action accepts any URL you give it, including LAN and localhost addresses, by design. And to be precise about scope: this website uses ordinary web analytics. The app does not.
AI provider API keys are stored in the macOS Keychain, never in the workflow database or in plain files. Updates arrive via Sparkle 2 with EdDSA-signed releases: the app will only install an update whose signature verifies against the public key compiled into it. Every release build is Developer ID signed, hardened-runtime, and notarized by Apple.
Provider configs store only a reference to the key; the key material itself lives under the app's Keychain service entry and is resolved at call time. The webhook trigger handles its own secrets too: its HTTP listener binds to loopback only by default, and binding to the LAN is refused unless every webhook on the listener requires authentication with a secret of at least 16 characters. Secret comparison is constant-time, to close off timing attacks. None of this makes a webhook internet-safe; the listener is never internet-facing unless you set up your own port forwarding, which the app neither does nor secures. The AI Agents (MCP) server follows the same discipline, more strictly: it stays off until you enable it, binds 127.0.0.1 only, and every request must present a 43-character bearer token persisted across launches (minted on first run, replaced only by an explicit Rotate) — it is never internet-facing and adds no outbound network calls.
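The webhook rules above can be sketched in a few lines of Python. This is an illustration under assumptions, not the app's code: the `Webhook` shape and the `bind_address` and `secret_matches` names are hypothetical. Only the 16-character minimum, the loopback default, and the constant-time comparison come from the description.

```python
import hmac
from dataclasses import dataclass
from typing import List, Optional

MIN_SECRET_LENGTH = 16  # minimum stated in the text


@dataclass
class Webhook:  # hypothetical shape, for illustration only
    path: str
    secret: Optional[str] = None  # None means "no authentication required"


def bind_address(webhooks: List[Webhook], want_lan: bool) -> str:
    """Loopback by default; LAN only if every webhook has a long enough secret."""
    if not want_lan:
        return "127.0.0.1"
    for hook in webhooks:
        if hook.secret is None or len(hook.secret) < MIN_SECRET_LENGTH:
            raise ValueError(f"refusing LAN bind: {hook.path} lacks a strong secret")
    return "0.0.0.0"


def secret_matches(presented: str, expected: str) -> bool:
    """Compare without leaking, through timing, how many leading bytes matched."""
    return hmac.compare_digest(presented.encode("utf-8"), expected.encode("utf-8"))
```

A plain `==` on strings can return as soon as the first differing character is found, which is the timing signal a constant-time comparison is meant to remove.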
Every file-writing workflow action (move, rename, delete, write, zip, unzip, log-to-file, trash) validates its path before touching disk. Paths inside sensitive locations are refused: system directories, shell and credential config files, and the running app bundle itself. So are paths that use .. traversal tricks. A quarantined .watchflow file from outside your machine (opened by double-click, Finder, or AirDrop) requires explicit confirmation before it can arm any node capable of destructive actions, so a stranger's file can't silently run commands as you.
Path validation runs on the fully rendered path, before any filesystem I/O. It rejects .. segments both before and after URL standardization. Tildes are expanded; symlinks are resolved, so a symlink can't smuggle a path out of an allowed area; and LLM artifacts like wrapping quotes are stripped before validation. The same resolver is shared by all eight file executors and all three of the AI Agent's file tools: one validation path, not eleven copies.
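A rough Python sketch of that order of operations follows. It is not the app's resolver: the deny-list is a small illustrative sample, rejecting relative paths is an assumption, and only the pre-normalization .. check is reproduced, since the standard library's normalization would collapse those segments anyway.

```python
import os
from pathlib import Path

# Illustrative deny-list only; the app's real list of sensitive locations is not shown here.
_SENSITIVE = ["/System", "/etc", "/usr", "~/.ssh", "~/.zshrc"]
# Resolve the deny-list too, so a symlinked root (such as /etc on macOS) still matches.
DENIED = [Path(os.path.realpath(os.path.expanduser(p))) for p in _SENSITIVE]


class PathRejected(Exception):
    """Raised when a rendered path must not be written to."""


def validate_path(rendered: str) -> Path:
    """Validate a fully rendered path string before any filesystem write."""
    cleaned = rendered.strip().strip("'\"")  # drop wrapping quotes an LLM may add
    if ".." in Path(cleaned).parts:  # traversal check on the raw form
        raise PathRejected("'..' traversal segment")
    expanded = os.path.expanduser(cleaned)  # expand a leading ~
    if not os.path.isabs(expanded):
        raise PathRejected("relative path")  # assumption: only absolute paths pass
    resolved = Path(os.path.realpath(expanded))  # follow symlinks before judging
    for denied in DENIED:
        if resolved == denied or denied in resolved.parents:
            raise PathRejected(f"inside sensitive location {denied}")
    return resolved
```

Resolving symlinks before the deny-list comparison is the step that stops a harmless-looking link from pointing a write into a protected directory.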
The import gate's danger set isn't hand-maintained: it's derived from the node registry's own destructive flags, so a new destructive node is covered automatically. The Run Script node is the edge case: it executes whatever you wrote, with your full privileges, unrestricted. That's its job, and exactly why the import trust gate exists.
The classic automation failure: a workflow watches a folder, its own action modifies that folder, and it triggers itself forever. Two independent layers stop this. A self-event suppressor makes a workflow's file trigger ignore events its own actions just caused. And a circuit breaker watches for a fire→run→fire chain and automatically pauses the runaway workflow with a visible warning. These layers exist because this exact bug happened in the field; we'd rather show you the fix than pretend it never did.
The interesting part is telling a runaway loop apart from you dragging fifty files into a watched folder. The breaker keys on shape: a loop is strict alternation: fire, run completes, fire again within 10 seconds. Five consecutive chain links trip it; any break in the pattern resets the count. A bulk drop produces overlapping fires (the opposite shape) and never trips. Timer triggers are exempt, because a 5-second timer is a legitimate serial chain.
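The shape test can be sketched as a small state machine. This Python version is an illustration, not the shipped breaker: the `LoopBreaker` class and its method names are hypothetical, and it assumes the 10-second window is measured from the previous run's completion to the next fire.

```python
from typing import Optional

CHAIN_WINDOW_SECONDS = 10.0  # window stated in the text
TRIP_AFTER_LINKS = 5         # consecutive chain links before pausing


class LoopBreaker:
    """Counts fire -> run completes -> fire links for one workflow."""

    def __init__(self) -> None:
        self.links = 0
        self.running = 0  # runs currently in flight
        self.last_completion: Optional[float] = None
        self.tripped = False

    def on_fire(self, now: float) -> bool:
        """Record a trigger fire; return True if the workflow should be paused."""
        is_link = (
            self.running == 0  # an overlapping fire is the bulk-drop shape
            and self.last_completion is not None
            and now - self.last_completion <= CHAIN_WINDOW_SECONDS
        )
        self.links = self.links + 1 if is_link else 0  # any break resets the count
        self.running += 1
        if self.links >= TRIP_AFTER_LINKS:
            self.tripped = True
        return self.tripped

    def on_run_completed(self, now: float) -> None:
        """Record that one run finished."""
        self.running = max(0, self.running - 1)
        self.last_completion = now
```

Fifty fires arriving while earlier runs are still in flight never count as links here, whereas a fire that follows each completion does, which is the distinction the paragraph above describes.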
There's belt-and-suspenders below that: Move and Rename are idempotent at the executor level, and folder watches can be scoped non-recursively, which fixed a field-reported loop where a workflow watching Downloads moved files into Downloads/Apps and re-triggered on its own output. The coverage has edges: a script node mutating files slips past the self-event suppressor, though the circuit breaker still catches a runaway, and exotic constructions remain your responsibility.
A Watchflows workflow is literally a directed acyclic graph: nodes connected by wires, and the editor refuses any connection that would form a cycle, so by construction a workflow can never chase itself in a circle. There are seven node categories and 56 built-in node types. Data moves as a payload, a bag of key/value pairs. Each node merges what arrives, adds its own outputs, and passes the combined result downstream, so a filename captured by the trigger is still there five nodes later.
Cycle prevention is a breadth-first search run at wire-drop time: before a connection is added, the app walks downstream from the target node, and if it can reach the source, the wire is rejected. The engine doesn't trust the editor, either. At run time a topological sort (Kahn's algorithm) that can't order every node aborts the run. Fan-in is allowed: multiple wires into one inlet merge their payloads, last write wins, before the node runs. Payload keys support dot-path traversal into nested JSON objects and arrays — including array indexing by position, like {{results.0.snippet}} or {{items.2}} — plus transforms like .length, .count, .uppercase, .lowercase, .trimmed, .first, .last, and .keys.
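Both checks described above are standard graph algorithms, so they are easy to sketch. The Python below is illustrative rather than the app's code; the `Graph` alias and function names are hypothetical, and the graph is modeled as a plain mapping from a node id to the ids it wires into.

```python
from collections import deque
from typing import Dict, List, Set

Graph = Dict[str, Set[str]]  # node id -> ids of the nodes it wires into


def would_create_cycle(graph: Graph, source: str, target: str) -> bool:
    """True if adding the wire source -> target would close a loop."""
    queue, seen = deque([target]), {target}
    while queue:  # breadth-first walk downstream from the target
        node = queue.popleft()
        if node == source:
            return True
        for nxt in graph.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return False


def topological_order(graph: Graph) -> List[str]:
    """Kahn's algorithm; raises if some node cannot be ordered (a cycle exists)."""
    nodes = set(graph) | {n for outs in graph.values() for n in outs}
    indegree = {n: 0 for n in nodes}
    for outs in graph.values():
        for n in outs:
            indegree[n] += 1
    ready = deque(sorted(n for n in nodes if indegree[n] == 0))
    order: List[str] = []
    while ready:
        node = ready.popleft()
        order.append(node)
        for nxt in sorted(graph.get(node, ())):
            indegree[nxt] -= 1
            if indegree[nxt] == 0:
                ready.append(nxt)
    if len(order) != len(nodes):
        raise ValueError("graph contains a cycle; aborting run")
    return order
```

The two functions overlap on purpose: the first keeps cycles out at edit time, and the second is the run-time check that does not depend on the editor having done its job.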
Execution is event-driven: when a trigger fires, only the part of the graph reachable downstream from that trigger runs; everything else is marked skipped. The Condition node routes each payload out a Yes or No outlet, with ten operators. If a node fails, everything downstream of it is skipped rather than run with bad data, and the run is marked failed. And every top-level node execution produces a record (input, output, timestamps, status, error) stored as observable history you can replay in the timeline; nodes inside a loop record their final iteration.
The engine is a Swift actor using dataflow scheduling, not lockstep levels: a node launches the moment all of its upstream dependencies have completed or been skipped, so independent branches naturally run concurrently inside a structured-concurrency task group. Skips cascade properly: a downstream node is skipped only when all of its incoming paths are dead, so a diamond-shaped graph behaves correctly. Conditions write a human-readable audit of every rule evaluation into the payload as _eval.N keys.
Loops are bounded by design. For Each fans out over an array via a dedicated Loop outlet: items run strictly sequentially. While re-evaluates under a hard max-iterations cap (default 20) and stamps _while_exceeded: true if the cap hits. The Run Script node deliberately does not expand {{variable}} templates inside the script body; payload values arrive as environment variables instead, specifically to prevent command injection from upstream data.
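The environment-variable approach for Run Script is worth seeing in miniature. The sketch below is Python, not the app's implementation: the `WF_` prefix, the key-to-name mapping, and both function names are hypothetical. What it takes from the text is only the principle that the script body is passed through untouched and payload values travel as environment variables.

```python
import os
import subprocess
from typing import Dict

ENV_PREFIX = "WF_"  # hypothetical naming scheme, not taken from the app


def script_environment(payload: Dict[str, object]) -> Dict[str, str]:
    """Expose payload values as environment variables instead of splicing them into code."""
    env = dict(os.environ)
    for key, value in payload.items():
        safe = "".join(c if c.isalnum() else "_" for c in key).upper()
        env[ENV_PREFIX + safe] = str(value)
    return env


def run_script(body: str, payload: Dict[str, object]) -> str:
    """Run the script body verbatim; upstream data is reachable only via the environment."""
    result = subprocess.run(
        ["/bin/sh", "-c", body],
        env=script_environment(payload),
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout
```

With this shape, a script that prints `"$WF_FILENAME"` treats a hostile filename such as `a.txt; rm -rf ~` as inert data, because the shell never parses the value as part of the script text.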
AI is pluggable behind a single provider interface: local Ollama and LM Studio, plus OpenAI, Anthropic, Google AI, OpenRouter, and any custom OpenAI-compatible endpoint. You choose the provider, including fully local options where nothing leaves your machine, and whatever you route into an AI node goes only to the provider you chose. The AI node can enforce structured output: give it a JSON Schema and the response is validated; on failure the node re-prompts the model with the specific validation error, up to three retries, and fails rather than passing malformed data downstream.
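The prompt, validate, re-prompt loop can be sketched with a toy validator. Everything here is an illustration in Python: the validator covers only a handful of JSON Schema keywords chosen for the example, `ask_model` is a stand-in for any provider call, and reading "up to three retries" as one initial attempt plus three more is an assumption.

```python
import json
from typing import Any, Callable, Dict, Optional

TYPES = {"string": str, "number": (int, float), "integer": int,
         "boolean": bool, "object": dict, "array": list}


def validate(value: Any, schema: Dict[str, Any], path: str = "$") -> Optional[str]:
    """Return None if value fits the schema, else a description of the first problem."""
    expected = schema.get("type")
    if expected in TYPES:
        ok = isinstance(value, TYPES[expected])
        if isinstance(value, bool) and expected in ("number", "integer"):
            ok = False  # bool is a subclass of int in Python
        if not ok:
            return f"{path}: expected {expected}"
    if isinstance(value, dict):
        for key in schema.get("required", []):
            if key not in value:
                return f"{path}: missing required key '{key}'"
        for key, sub in schema.get("properties", {}).items():
            if key in value:
                err = validate(value[key], sub, f"{path}.{key}")
                if err:
                    return err
    if isinstance(value, list) and "items" in schema:
        for i, item in enumerate(value):
            err = validate(item, schema["items"], f"{path}[{i}]")
            if err:
                return err
    return None


def structured_call(ask_model: Callable[[str], str], prompt: str,
                    schema: Dict[str, Any], max_retries: int = 3) -> Any:
    """Ask, validate, and re-prompt with the specific error; fail rather than pass bad data."""
    attempt_prompt, error = prompt, None
    for _ in range(1 + max_retries):
        reply = ask_model(attempt_prompt)
        try:
            value = json.loads(reply)
            error = validate(value, schema)
        except json.JSONDecodeError as exc:
            error = f"response is not valid JSON ({exc.msg})"
        if error is None:
            return value
        attempt_prompt = (f"{prompt}\n\nYour previous reply was rejected: {error}. "
                          "Reply with JSON only.")
    raise ValueError(f"output failed validation after {max_retries} retries: {error}")
```

Because the loop needs nothing from the provider beyond text in and text out, the same enforcement works whether `ask_model` talks to a local model or a hosted API.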
On a clean install the only pre-configured provider is local Ollama at localhost:11434; no cloud provider is wired up by default. The schema validator is a deliberately small subset of JSON Schema, about 70 lines, no external dependency; enforcement is prompt→validate→re-prompt, so it works identically against local Ollama and a hosted cloud API. When an AI node's output is itself a JSON object or array, it is auto-parsed into a navigable structure, so downstream nodes can reach fields with {{result.field}} or array positions with {{result.0}}; plain prose and bare scalars stay text. Beyond the single-prompt node there's an Agent node: an autonomous tool-using loop with a goal, a max-iteration cap, and an optional token budget. Its file tools go through the same path validator; its web fetch tool carries an SSRF guard.
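The auto-parse rule and positional dot paths fit in a short sketch. This Python is illustrative only: the function names are hypothetical, returning `None` for a missing key is an assumed behavior, and transforms such as .length or .uppercase are left out.

```python
import json
from typing import Any


def auto_parse(text: str) -> Any:
    """Parse JSON objects and arrays; leave prose and bare scalars as text."""
    try:
        value = json.loads(text)
    except (ValueError, TypeError):
        return text
    return value if isinstance(value, (dict, list)) else text


def resolve(payload: Any, dot_path: str) -> Any:
    """Walk a dot path such as 'result.0.snippet' through dicts and lists."""
    current = payload
    for part in dot_path.split("."):
        if isinstance(current, list) and part.isdecimal():
            index = int(part)
            if index >= len(current):
                return None
            current = current[index]
        elif isinstance(current, dict) and part in current:
            current = current[part]
        else:
            return None  # assumed behavior for a missing key; the app may differ
    return current
```

For example, a model reply of `[{"snippet": "a"}, {"snippet": "b"}]` stored under `result` makes `result.1.snippet` resolve to `b`, while a reply of `42` or a sentence of prose stays a plain string.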
The SSRF guard: only http/https schemes are allowed, and the localhost hostname (plus its IPv6 equivalents) and IP literals in loopback, private, or link-local ranges are refused before any network I/O. A custom redirect delegate re-validates every redirect target, so a public URL that 302s to a localhost literal is blocked mid-flight. This screens the host literally rather than resolving names, so it's one defense in depth and not a complete egress firewall; the local-only providers exist for anyone who'd rather an agent never reach the open network at all.
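A literal host screen of this kind can be sketched with Python's standard library. It is an illustration, not the app's guard: the function names are hypothetical, only the plain `localhost` name is listed (other aliases are omitted), and, matching the limitation stated above, hostnames are not resolved.

```python
import ipaddress
from typing import Iterable, Optional
from urllib.parse import urlsplit

BLOCKED_HOSTNAMES = {"localhost"}  # other local aliases are omitted in this sketch


def is_allowed_url(url: str) -> bool:
    """Literal screen only: scheme, hostname, and IP-literal ranges. No DNS lookup."""
    try:
        parts = urlsplit(url)
        host = parts.hostname
    except ValueError:
        return False
    if parts.scheme not in ("http", "https") or not host:
        return False
    host = host.rstrip(".")
    if host in BLOCKED_HOSTNAMES:
        return False
    try:
        address = ipaddress.ip_address(host)
    except ValueError:
        return True  # an ordinary hostname; it is deliberately not resolved here
    return not (address.is_loopback or address.is_private or address.is_link_local)


def first_blocked_hop(chain: Iterable[str]) -> Optional[str]:
    """Re-check every URL in a redirect chain; return the first refused one, if any."""
    for url in chain:
        if not is_allowed_url(url):
            return url
    return None
```

Running the same check on each redirect target is what catches a public URL that bounces to a loopback literal; a hostname that merely resolves to a private address would still pass this screen.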
Neither the human nor the AI is taken on faith.
Watchflows is built by one developer (25+ years in software, a former CTO) directing Claude AI agents. The loop is deliberately simple: discuss the problem first, build it, test it (the full suite must be green before anything ships, and new behavior gets tests), then release. Substantial or risky changes go through a multi-agent pipeline: verify the root cause, implement test-first, then adversarial review by a separate agent whose job is to find what the first one missed. Small fixes skip the ceremony.
The multi-agent pipeline has a name in the repo ("ultracode"), and it's opt-in per change. The test command chains the Lambda tests, a site-link checker, and the full Swift suite, which runs with parallel testing disabled, a deliberate determinism-over-speed tradeoff. Past bugs are made non-regressable with negative-space grep gates: eleven shell scripts, most asserting that a pattern from a previously-fixed security issue never reappears. Six are wired into CI to fail the build if one does.
CI gates every pull request on a macOS runner. Releases are tag-triggered and fully automated: build, Developer ID codesign, Apple notarization, ticket stapling, EdDSA-sign the update, regenerate the appcast. The pipeline inspects the signed artifact's entitlements and refuses to publish if get-task-allow is present, a hard gate that keeps a debug build from ever shipping as a release.