Glossary
194 terms across 14 groups. Every definition is one sentence; every entry links to an authoritative free reference (Wikipedia, MDN, OWASP, MITRE CWE, RFCs, vendor spec docs). When PreFlight has its own page on the topic, the entry also links there.
Curated, not authored. We pick the terms that matter and link out for the depth.
Web application security, Cryptography and identity, Supply chain, AI and LLM security, OWASP categories, CS fundamentals, Software architecture, Distributed systems, Observability and operations, Networking and HTTP, Web platform, Build and tooling, Accessibility (A11y), PreFlight vocabulary
Web application security
The vocabulary that shows up on most finding cards. If you read one section in order, this is the one.
XSS (Cross-Site Scripting) Attacker-controlled HTML or JavaScript executed in a victim's browser via a vulnerable page.
CSRF (Cross-Site Request Forgery) A request the user did not intend, sent from a malicious site to a service the user is logged into.
SSRF (Server-Side Request Forgery) A server tricked into making an outbound HTTP request to a destination the attacker controls or chooses.
SQL injection User input concatenated into a SQL string before the query is parsed, letting attackers rewrite the query.
Path traversal A user-controlled path that lets the attacker read or write files outside the intended directory.
IDOR (Insecure Direct Object Reference) Authorization missing on a per-row basis; user A can request user B's record by changing an ID.
RCE (Remote Code Execution) An attacker runs arbitrary code on the target system without prior authorization.
Open redirect An endpoint that redirects to a URL controlled by the user, letting attackers borrow the domain's trust.
CORS (Cross-Origin Resource Sharing) The browser policy that controls which origins can read responses from a given API.
CSP (Content Security Policy) A response header that whitelists which sources a page can load scripts, styles, images, and connections from.
HSTS (HTTP Strict Transport Security) A header that tells browsers to use HTTPS for the domain for a given duration, defeating downgrade attacks.
X-Frame-Options A header that prevents a page from being embedded in an iframe (clickjacking defense).
Subresource Integrity (SRI) A hash attribute on a script or stylesheet tag that lets the browser refuse altered content.
HttpOnly A cookie attribute that prevents JavaScript from reading the cookie value, mitigating XSS-to-session theft.
SameSite A cookie attribute that controls whether the cookie is sent on cross-site requests (CSRF defense).
JWT (JSON Web Token) A signed-or-encrypted token format with a JSON payload, commonly used for auth state.
OAuth 2.0 The standard authorization framework that lets a user grant a third-party app limited access to their account.
OIDC (OpenID Connect) An identity layer on top of OAuth 2.0 that lets clients verify who the user is.
mTLS (Mutual TLS) TLS where both the client and the server present certificates, used for service-to-service auth.
MFA (Multi-Factor Authentication) Requiring two or more independent credentials (password, device, biometric) for a single login.
SAML (Security Assertion Markup Language) An XML-based standard for enterprise single sign-on that predates and overlaps with OAuth/OIDC.
Same-origin policy The browser rule that prevents scripts on one origin from reading data on another.
Reverse tabnabbing An attack where a page opened via target="_blank" rewrites the parent tab to a phishing site.
Clickjacking Tricking a user into clicking something different from what they perceive, often via an invisible iframe.
Trojan Source A source-code attack that uses Unicode bidi control characters to make source render differently than it compiles.
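To make the path traversal entry above concrete, here is a minimal sketch of a containment check in Node. The helper name `resolveInside` and the `/srv/uploads` directory are illustrative assumptions, not part of PreFlight. The sketch does not resolve symlinks, so a real check would likely need `fs.realpath` as well.

```javascript
const path = require('node:path');

// Hypothetical helper: resolve a user-supplied path against a base directory
// and refuse anything that lands outside it.
function resolveInside(baseDir, userPath) {
  const base = path.resolve(baseDir);
  const target = path.resolve(base, userPath);
  // If the relative path from base to target climbs upward, target is outside.
  const rel = path.relative(base, target);
  const escapes =
    rel === '..' || rel.startsWith('..' + path.sep) || path.isAbsolute(rel);
  if (escapes) {
    throw new Error('path escapes base directory: ' + userPath);
  }
  return target;
}
```

Under these assumptions, `resolveInside('/srv/uploads', 'a/b.txt')` returns a path inside the base directory, while `resolveInside('/srv/uploads', '../../etc/passwd')` throws.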
Cryptography and identity
The primitives behind every authentication, signature, and secret. Most findings touch one of these.
AES (Advanced Encryption Standard) The default symmetric cipher for most data-at-rest and TLS workloads.
RSA A widely-used asymmetric cipher; keys are typically 2048 or 4096 bits.
ECDSA Elliptic Curve Digital Signature Algorithm. Faster and smaller-key than RSA for the same security.
Ed25519 A modern elliptic-curve signature scheme. Recommended for new signing keys.
HMAC A message authentication code built on a hash function. Used for signing webhooks and JWTs.
SHA-256 A 256-bit cryptographic hash function in the SHA-2 family.
CSPRNG Cryptographically secure pseudo-random number generator. Math.random is NOT one.
Salt Random bytes added to a password before hashing so the same password produces different hashes.
Nonce A number used once. Random or sequential, used to prevent replay attacks.
Argon2 The modern recommended password-hashing function. Tunable for memory + time costs.
bcrypt A long-standing password-hashing function. Still acceptable; Argon2 is the modern preference.
HSM (Hardware Security Module) A dedicated device for storing keys and performing crypto operations. Keys never leave the HSM.
Certificate Transparency A public log of every issued TLS certificate. Lets domain owners detect misissued certs.
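As an illustration of the message-authentication-code entry in this section (used for signing webhooks), the following sketch signs and verifies a payload with HMAC-SHA256 using Node's built-in `crypto` module. The function names `sign` and `verify` and the hex encoding are assumptions chosen for the example; real webhook providers each define their own header and encoding.

```javascript
const crypto = require('node:crypto');

// Compute an HMAC-SHA256 signature over the raw body, hex-encoded.
function sign(secret, body) {
  return crypto.createHmac('sha256', secret).update(body).digest('hex');
}

// Verify a received signature in constant time.
function verify(secret, body, signatureHex) {
  const expected = Buffer.from(sign(secret, body), 'hex');
  const given = Buffer.from(signatureHex, 'hex');
  // timingSafeEqual throws on length mismatch, so compare lengths first.
  return given.length === expected.length &&
    crypto.timingSafeEqual(given, expected);
}
```

The constant-time comparison matters because a plain `===` on strings can leak how many leading characters matched.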
Supply chain
The vocabulary of the 2025-2026 npm worm waves and the broader package-trust problem.
Typosquatting A package registered with a name one character off from a popular one, hoping for install typos.
Slopsquatting A package registered to match a name that LLMs commonly hallucinate, harvesting installs from vibe-coded projects.
Dependency confusion An attack that abuses package managers preferring public-registry versions over private ones with the same name.
SBOM (Software Bill of Materials) A formal record of every dependency in a build. Used for vulnerability scanning and audit.
SLSA Supply-chain Levels for Software Artifacts. A framework for build-pipeline integrity.
postinstall script An npm lifecycle hook that runs arbitrary code at install time. The execution surface most worms abuse.
ignore-scripts An npm config flag that disables lifecycle scripts. Recommended for CI runners.
An npm config that refuses to install package versions younger than the given age. Defeats fast-moving worms.
Lockfile A version-pinned manifest (package-lock.json, yarn.lock, pnpm-lock.yaml) that fixes the dependency tree.
The September 2025 npm worm wave. Postinstall scripts stole maintainer tokens and republished poisoned packages.
The 2026 successor wave. April SAP / Bitwarden waves and the May 11 TanStack wave by TeamPCP.
Indicator of Compromise (IOC) A signature, hash, file path, or string that identifies a known attack on a host.
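The install-time lifecycle hook described in this section can be spotted with a very small check. The sketch below takes an already-parsed package.json object and lists the hooks npm runs automatically during install. The helper name `findInstallHooks` and the sample manifest are hypothetical; this is not how PreFlight's own probes are written.

```javascript
// Lifecycle hooks npm runs automatically while installing a package.
const INSTALL_HOOKS = ['preinstall', 'install', 'postinstall'];

// Hypothetical helper: given a parsed package.json object, return the
// install-time hooks it declares along with their commands.
function findInstallHooks(manifest) {
  const scripts = manifest.scripts || {};
  return INSTALL_HOOKS
    .filter((name) => typeof scripts[name] === 'string')
    .map((name) => ({ name, command: scripts[name] }));
}
```

For a manifest such as `{ scripts: { postinstall: 'node setup.js', test: 'jest' } }` the function reports only the `postinstall` entry, since `test` does not run at install time.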
AI and LLM security
OWASP LLM Top 10 vocabulary plus the AI-tooling specifics PreFlight scans for.
LLM (Large Language Model) A transformer-based model trained on text to predict the next token. The thing your AI tool is.
Prompt injection User input crafted to override the system prompt and make the model follow attacker instructions.
Indirect prompt injection Prompt injection delivered via content the model reads later (a document, a page, a tool output).
RAG (Retrieval-Augmented Generation) A pattern where the model is given chunks retrieved from a vector store as context for the query.
Embedding A vector representation of text that captures semantic meaning. Used for similarity search.
Vector database A database optimized for high-dimensional nearest-neighbor search over embeddings.
Context window The maximum number of tokens (input + output) a model can attend to in one request.
System prompt The fixed instruction text given to a model alongside the user prompt. Defines the assistant's persona.
Temperature The randomness knob on model sampling. 0 is deterministic; higher values increase variety.
Tokenization The step that splits text into subword units the model actually processes. Cost is per token.
In-context learning A model adapting to a task from examples in the prompt, without weight updates.
Fine-tuning Continuing the training of a pretrained model on task-specific data to adjust its weights.
RLHF (Reinforcement Learning from Human Feedback) A training method where humans rank model outputs and the model learns to prefer the higher-ranked ones.
MCP (Model Context Protocol) The standard for connecting AI assistants to local and remote tools. The interop layer for agents.
A structured prompt spec that defines an agent's role, skills, voice, and refusals. PreFlight ships four.
Jailbreak A prompt-injection variant that bypasses an LLM's safety training to elicit refused outputs.
Hallucination Model output that sounds plausible but is factually wrong or refers to things that do not exist.
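The similarity search mentioned in the embedding and vector-store entries above usually comes down to comparing vectors. The sketch below uses cosine similarity and a brute-force top-k scan. The function names and the tiny two-dimensional vectors are illustrative assumptions; production vector databases use approximate indexes rather than a linear scan.

```javascript
// Cosine similarity between two equal-length numeric vectors.
// Returns NaN if either vector is all zeros.
function cosineSimilarity(a, b) {
  if (a.length !== b.length) throw new Error('dimension mismatch');
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Brute-force nearest neighbors: score every document, keep the top k.
function nearest(query, docs, k) {
  return docs
    .map((doc) => ({ id: doc.id, score: cosineSimilarity(query, doc.vector) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k);
}
```

In a RAG setup the `docs` array would hold chunk embeddings and the top-k results would be pasted into the model's context.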
OWASP categories
The Top 10 (2021 edition, whose A01-A10 numbering the entries below follow) and LLM Top 10 (2025 edition), one line each. Full mapping at /learn/owasp.
A01: Broken Access Control Authorization missing or wrong. The most prevalent application security risk.
A02: Cryptographic Failures Secrets or sensitive data exposed via weak crypto, insecure storage, or insecure transport.
A03: Injection User input executed as code or query. SQL injection, command injection, NoSQL injection, LDAP injection.
A04: Insecure Design A system designed in a way that produces vulnerable shapes regardless of how carefully each line is written.
A05: Security Misconfiguration Defaults left in production, security headers absent, dev surfaces exposed.
A06: Vulnerable and Outdated Components Known-vulnerable dependencies still in production. Supply-chain compromises live here.
A07: Authentication Failures Weak password policy, missing MFA, session handling errors, credential stuffing tolerance.
A08: Software and Data Integrity Failures Trust placed in components or supply-chain artifacts that have not been verified.
A09: Security Logging Failures Security events that happen without leaving a log. The blind spot in every incident response.
A10: SSRF Server-Side Request Forgery. Server fetches a URL the client supplies.
LLM01: Prompt Injection User input that overrides the system prompt or smuggles instructions through retrieved context.
LLM02: Sensitive Information Disclosure Model output that reveals data the calling user should not see (system prompt, other tenants, training data).
LLM04: Data and Model Poisoning Attacker-controlled content injected into training data or RAG ingestion pipelines.
LLM06: Excessive Agency An agent with tool capabilities beyond what the task requires. Big blast radius on injection.
LLM07: System Prompt Leakage System prompts surfaced via error messages, debug output, or model-coaxed disclosure.
LLM08: Vector and Embedding Weaknesses Cross-tenant retrieval leakage, embedding-cache poisoning, query-time scope filter bypass.
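A01 (Broken Access Control) is easiest to see in its per-record form, where a lookup by ID forgets to check ownership. The following sketch shows the missing check added. The in-memory `invoices` map, the user IDs, and the `getInvoice` helper are all made up for illustration.

```javascript
// Toy data store standing in for a database table.
const invoices = new Map([
  ['inv-1', { id: 'inv-1', ownerId: 'alice', total: 120 }],
  ['inv-2', { id: 'inv-2', ownerId: 'bob', total: 75 }],
]);

// Look up an invoice on behalf of the authenticated user.
function getInvoice(currentUserId, invoiceId) {
  const invoice = invoices.get(invoiceId);
  // Same response for "missing" and "not yours", so IDs cannot be probed.
  if (!invoice || invoice.ownerId !== currentUserId) return null;
  return invoice;
}
```

Without the `ownerId` comparison, `getInvoice('alice', 'inv-2')` would return Bob's record just because Alice changed the ID.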
CS fundamentals
Vocabulary every developer should be able to define. If you came from a non-CS background and want to fill in the gaps, see also the CS section of the Resources page.
Algorithm A finite sequence of well-defined steps for solving a problem or computing a value.
Data structure A way of organizing data so the operations you care about are efficient.
Big O notation Notation for how an algorithm's resource use grows as the input gets bigger.
Hash table A data structure with average O(1) lookup, insert, and delete via a hash function on keys.
Binary tree A tree data structure where each node has at most two children. Foundation for many search structures.
Graph A set of nodes (vertices) and edges. Models networks, dependencies, relationships.
Recursion A function calling itself, with a base case that stops the recursion.
Dynamic programming An algorithm technique that solves problems by combining solutions to overlapping subproblems.
Greedy algorithm An algorithm that makes the locally-optimal choice at each step, hoping for a global optimum.
Divide and conquer Solving a problem by recursively breaking it into smaller subproblems of the same type.
OOP (Object-Oriented Programming) A paradigm where state and behavior are bundled into objects with class-defined methods.
Functional programming A paradigm where functions are first-class, state is immutable, and side effects are isolated.
Pure function A function whose output depends only on its arguments and that has no side effects.
Closure A function bundled with its enclosing scope, so the function can reference variables from outside.
Mutability vs immutability Whether a value can be changed after creation. Immutable values are safer to share across code.
Abstraction Hiding implementation details behind an interface. The thing that lets us reason about big systems.
Encapsulation Bundling state and the operations that act on it, hiding state from outside access.
Polymorphism One name (function, operator, interface) usable across different types.
Type system The rules a language uses to classify values and check that operations are valid for them.
Garbage collection Automatic memory management. The runtime reclaims memory no longer reachable.
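Several entries in this section (a function calling itself with a base case, overlapping subproblems, a function bundled with its enclosing scope) fit into one short example. The sketch below computes Fibonacci numbers with a memo table held in a closure. It assumes non-negative integer input, and the names are illustrative.

```javascript
// Build a Fibonacci function whose cache lives in the enclosing scope.
function makeFib() {
  // Memo table seeded with the two base cases.
  const memo = new Map([[0, 0], [1, 1]]);

  function fib(n) {
    if (memo.has(n)) return memo.get(n); // base case or cached subproblem
    const value = fib(n - 1) + fib(n - 2); // recursive step
    memo.set(n, value); // each subproblem is solved only once
    return value;
  }

  return fib; // fib closes over memo
}

const fib = makeFib();
```

Without the memo table the same recursion takes exponential time; with it, each value up to n is computed once, so the work grows linearly in n.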
Software architecture
Patterns that show up once a system has more than one moving part. The Resources page links to free reading per term.
Monolith A single deployable unit that contains all the application code. Often a sensible starting point.
Microservice A service responsible for one bounded capability, deployed independently. Comes with operational cost.
Service-Oriented Architecture (SOA) The broader pattern of decomposing a system into services that communicate over a network.
Event-driven architecture Components communicate by producing and consuming events on a queue or stream.
CQRS (Command Query Responsibility Segregation) Separating write paths (commands) from read paths (queries), often with different data models.
Event sourcing Persisting the events that produced state rather than the state itself. Replay rebuilds the state.
Idempotency A property where applying the same operation multiple times has the same effect as applying it once.
CAP theorem A distributed system can guarantee at most two of: consistency, availability, partition tolerance.
Eventual consistency A consistency model where replicas converge over time once writes stop.
ACID Atomicity, Consistency, Isolation, Durability. The transactional guarantees of traditional RDBMS.
BASE Basically Available, Soft state, Eventually consistent. The trade-off many NoSQL systems make.
MVC Model-View-Controller. A pattern that separates data, presentation, and input handling.
Hexagonal architecture A pattern that separates business logic from inputs/outputs via ports and adapters.
Clean architecture Layered architecture with dependencies pointing toward business rules, not away from them.
Dependency injection Providing a component's dependencies from outside rather than letting it instantiate them.
Domain-Driven Design (DDD) Designing software around the language and structure of the problem domain.
API gateway A single entry point that routes requests to backend services. Adds auth, rate limits, observability.
Sidecar pattern Deploying a helper process alongside an application to handle cross-cutting concerns.
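The "persist events, replay to rebuild state" idea from this section is compact enough to sketch. The example below models an account balance as a fold over an event log. The event names `Deposited` and `Withdrawn` and the function names are invented for the illustration.

```javascript
// Apply one event to the current state and return the next state.
function applyEvent(balance, event) {
  switch (event.type) {
    case 'Deposited':
      return balance + event.amount;
    case 'Withdrawn':
      return balance - event.amount;
    default:
      throw new Error('unknown event type: ' + event.type);
  }
}

// Rebuild state from scratch by replaying the full event log in order.
function replay(events) {
  return events.reduce(applyEvent, 0);
}
```

Because the log is the source of truth, replaying `[Deposited 100, Withdrawn 30, Deposited 5]` always yields 75, and a new read model can be built later from the same events.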
Distributed systems
The vocabulary of systems where more than one process / machine is involved.
Replication Keeping multiple copies of data on different nodes for availability and read scale.
Sharding Partitioning data across nodes so each node owns a subset.
Leader election A consensus protocol where nodes agree on which one is currently the primary.
Consensus A distributed agreement on a single value across multiple nodes, despite failures.
Raft A consensus algorithm designed to be more understandable than Paxos. Used by etcd, Consul.
Paxos The classic distributed-consensus algorithm. Conceptually elegant, notoriously hard to implement.
Two-phase commit (2PC) A protocol for distributed transactions. Coordinator asks all participants to prepare, then commit.
Exactly-once delivery Each message is delivered exactly once. In practice, you get at-least-once + idempotency instead.
At-least-once delivery Each message is delivered one or more times. Pair with idempotent consumers.
Circuit breaker A pattern that stops calling a failing dependency for a cooldown period after consecutive failures.
Exponential backoff A retry strategy where the wait between attempts grows exponentially, often with jitter.
Backpressure A signal from a slow consumer to a fast producer to slow down, preventing queue buildup.
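The retry strategy described in this section, where the wait grows exponentially and jitter is added, can be sketched as follows. The "full jitter" variant shown here (a random delay between zero and the exponential ceiling) is one common choice among several, and the parameter names and defaults are assumptions.

```javascript
// Delay before retry number `attempt` (0-based), with full jitter.
// `random` is injectable so the function can be tested deterministically.
function backoffDelayMs(attempt, options = {}) {
  const { baseMs = 100, capMs = 10000, random = Math.random } = options;
  // Exponential growth, clamped so the wait never exceeds capMs.
  const ceiling = Math.min(capMs, baseMs * 2 ** attempt);
  return Math.floor(random() * ceiling);
}

// Retry an async operation, sleeping with backoff between failures.
async function retry(operation, { retries = 5, ...backoff } = {}) {
  for (let attempt = 0; ; attempt++) {
    try {
      return await operation();
    } catch (err) {
      if (attempt >= retries) throw err; // out of attempts: surface the error
      const delay = backoffDelayMs(attempt, backoff);
      await new Promise((resolve) => setTimeout(resolve, delay));
    }
  }
}
```

Math.random is fine here because jitter only needs to spread clients out, not resist an attacker. Retries are only safe when the operation is idempotent.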
Observability and operations
What you need after the code ships. The discipline of running production.
Logging Emitting structured records of events for later inspection. PreFlight's Security Logging probe scans for this.
Metrics Numerical measurements over time. Counters, gauges, histograms.
Distributed tracing Following a single request through every service it touches. Built on spans and trace IDs.
OpenTelemetry The cross-vendor standard for observability data (traces, metrics, logs).
SLI (Service Level Indicator) A measurable signal about service health (latency, error rate, availability).
SLO (Service Level Objective) A target value for an SLI over a time window. "99.9% of requests under 200ms over 30 days."
SLA (Service Level Agreement) A contractual commitment about an SLO, with consequences (refunds, credits) for missing it.
Error budget The amount of allowed unreliability remaining before you breach the SLO.
RED method Service health summarized by Rate, Errors, and Duration. Three metrics per endpoint.
USE method Resource health summarized by Utilization, Saturation, Errors.
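The relationship between an SLO and its remaining allowance of unreliability is simple arithmetic. The sketch below works it out for a request-based SLO. The function name and the returned field names are illustrative assumptions, and results are floating-point, so expect tiny rounding differences.

```javascript
// Error budget for a request-based SLO over one time window.
// sloTarget is a fraction, e.g. 0.999 for "99.9% of requests succeed".
function errorBudget(sloTarget, totalRequests, failedRequests) {
  // Failures the SLO tolerates in this window.
  const allowedFailures = (1 - sloTarget) * totalRequests;
  return {
    allowedFailures,
    remaining: allowedFailures - failedRequests,
    consumed: failedRequests / allowedFailures, // 1.0 means the budget is spent
  };
}
```

With a 99.9% target over 1,000,000 requests, roughly 1,000 failures are allowed; after 250 failures about a quarter of the budget is consumed.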
Networking and HTTP
The protocols underneath every API call.
HTTP The application-layer protocol for the web. Request/response. Stateless by default.
HTTPS HTTP over TLS. Encrypted in transit, server identity verified via certificate.
TLS Transport Layer Security. The successor to SSL. Provides encryption, integrity, identity.
TCP Transmission Control Protocol. Connection-oriented, ordered, reliable delivery.
UDP User Datagram Protocol. Connectionless, unordered, no delivery guarantees. Lower overhead.
DNS Domain Name System. Translates human-readable names to IP addresses.
CDN (Content Delivery Network) A network of edge servers that cache content close to users for lower latency.
Reverse proxy A server that sits in front of one or more origins, terminating TLS, routing, caching.
WebSocket A persistent bidirectional connection over HTTP-upgraded TCP. Real-time without polling.
Server-Sent Events (SSE) A unidirectional streaming protocol. Server pushes events to the client over HTTP.
gRPC A high-performance RPC framework over HTTP/2 with Protocol Buffers as the wire format.
REST An architectural style for web APIs based on resources, HTTP verbs, and statelessness.
GraphQL A query language for APIs. Clients describe the shape of the response they want.
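The server-push streaming entry in this section rests on a plain-text wire format sent with the `text/event-stream` content type. The sketch below formats a single event frame; the helper name `formatSseEvent` and the sample event name are assumptions, and the server plumbing around it is left out.

```javascript
// Format one Server-Sent Events frame.
// Each field is a "name: value" line; a blank line ends the event.
function formatSseEvent({ event, id, data }) {
  const lines = [];
  if (event) lines.push('event: ' + event);
  if (id) lines.push('id: ' + id);
  // Multi-line payloads are sent as one "data:" line per line of text.
  for (const line of String(data).split('\n')) {
    lines.push('data: ' + line);
  }
  return lines.join('\n') + '\n\n';
}
```

A server would write the returned string to an open HTTP response; the browser's `EventSource` reassembles the `data:` lines into one message.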
Accessibility (A11y)
How to make sure the 15-20% of users who need accessibility considerations can actually use the site.
WCAG Web Content Accessibility Guidelines. The authoritative accessibility standard.
ARIA Accessible Rich Internet Applications. Attributes that expose UI semantics to assistive tech.
Screen reader Assistive technology that converts on-screen content into speech or braille.
Landmark A semantic HTML element (header, nav, main, footer, aside) screen readers use for navigation.
Focus management Controlling which element receives keyboard input. Critical for users who can't use a mouse.
Skip link A hidden-until-focused link that lets keyboard users skip past nav directly to main content.
Target size The minimum size of interactive targets. 24×24 CSS px at WCAG 2.2 level AA, 44×44 at AAA.
Color contrast The luminance ratio between text and background. 4.5:1 minimum for normal text.
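The 4.5:1 figure above comes from a defined formula, so it can be checked in code. The sketch below follows the WCAG 2.x relative-luminance and contrast-ratio definitions for sRGB colors given as 8-bit `[r, g, b]` arrays. The function names are illustrative, and alpha transparency is not handled.

```javascript
// Convert one 8-bit sRGB channel to its linear-light value.
function channelToLinear(value8bit) {
  const c = value8bit / 255;
  return c <= 0.03928 ? c / 12.92 : ((c + 0.055) / 1.055) ** 2.4;
}

// Relative luminance of an [r, g, b] color, from 0 (black) to 1 (white).
function relativeLuminance([r, g, b]) {
  return (
    0.2126 * channelToLinear(r) +
    0.7152 * channelToLinear(g) +
    0.0722 * channelToLinear(b)
  );
}

// Contrast ratio between two colors, from 1:1 up to 21:1.
function contrastRatio(foreground, background) {
  const [lighter, darker] = [
    relativeLuminance(foreground),
    relativeLuminance(background),
  ].sort((a, b) => b - a);
  return (lighter + 0.05) / (darker + 0.05);
}
```

Black on white gives the maximum ratio of 21, and the gray `[118, 118, 118]` on white lands just above the 4.5 threshold.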
PreFlight vocabulary
Terms that exist inside PreFlight: probes, personas, the manifesto, the safety contracts.
A pure function that scans file content and returns findings. PreFlight has 43 of them.
One detected issue: probe name, severity, category, CWE, file:line, evidence, remediation, OWASP code.
An FNV-1a hash of probe + file + title + ±3-line context. Survives line shifts and reformats.
Marking a finding as false-positive, wont-fix, or accepted-risk. Keyed on stable ID.
BYOK (Bring Your Own Key) PreFlight's pattern for AI features: you supply the API key, requests go directly to your provider.
BYOT (Bring Your Own Token) The same pattern for private GitHub repo scanning: you supply a PAT.
Software built primarily through natural-language prompts to an AI tool. PreFlight's audience.
The stance PreFlight takes: capable practitioners, mechanics-instructor register, no preaching.
The Persona+ spec for security fix generation. Dual-mode: SAM_COMMAND_FULL + SAM_COMMAND_SNIPPET.
The Persona+ spec for educational content authoring + grading. Dual-mode: AUTHOR + GRADE.
The Persona+ spec for design-rules enforcement (planned v1.1).
The Persona+ spec for engineering-rules enforcement (planned v1.1).
Per-finding adversarial inputs that demonstrate the attack. Static-only, no execution. v1 on feature/breakers-v1.
PreFlight's founding principle: the tool has to pass its own audit on every build, or the build fails.
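The hash-based finding ID described in this section can be illustrated with a 32-bit FNV-1a over the listed fields. This is a sketch under assumptions, not PreFlight's actual implementation: the 32-bit width, the field separator, the whitespace normalization, and the names `fnv1a32` and `stableId` are all hypothetical.

```javascript
// 32-bit FNV-1a over the UTF-8 bytes of a string.
function fnv1a32(text) {
  let hash = 0x811c9dc5; // FNV offset basis (32-bit)
  for (const byte of Buffer.from(text, 'utf8')) {
    hash ^= byte;
    hash = Math.imul(hash, 0x01000193) >>> 0; // multiply by FNV prime, keep 32 bits
  }
  return hash >>> 0;
}

// Hypothetical ID: hash probe + file + title + nearby source lines.
// The line number is deliberately left out of the hashed input.
function stableId({ probe, file, title, contextLines }) {
  // Assumed normalization: collapse whitespace so reformatting does not matter.
  const context = contextLines.map((l) => l.trim().replace(/\s+/g, ' '));
  const input = [probe, file, title, ...context].join('\u0000');
  return fnv1a32(input).toString(16).padStart(8, '0');
}
```

Since the line number is not part of the input, moving the same code a few lines down leaves the ID unchanged, which is what lets a suppression keyed on it keep matching.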
Source-of-truth: src/lib/glossary.js. Spotted a term that should be here, or a definition that’s off? Open a PR at github.com/midatlanticAI/PreFlight.