OBLITERATUS

Break the chains. Free the mind. Keep the brain. OBLITERATUS is an open-source toolkit for understanding and removing refusal behaviors from large language models, and every run with telemetry enabled adds benchmark data that informs later work on it. It implements abliteration, a family of techniques that identify and surgically remove the internal representations responsible for content refusal, without retraining or fine-tuning. The goal: a model that responds to prompts it previously refused, while preserving its core language capabilities.

OBLITERATUS is more than a tool — it’s a distributed research experiment. Every time you obliterate a model with telemetry enabled, your run contributes anonymous benchmark data to a growing, crowd-sourced dataset that powers the next generation of abliteration research.

What OBLITERATUS does

Map the chains

Ablation studies systematically knock out model components and measure what breaks — revealing where refusal is anchored inside the transformer.
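The knock-out idea can be illustrated on a toy model. The sketch below is not the OBLITERATUS API: `ablation_study`, the additive "model", and the `probe` direction are hypothetical names invented for this illustration. It zeroes one component at a time and records how much a scalar metric drops.

```python
import numpy as np

def ablation_study(components, x, metric):
    """Zero out each component in turn and report the change in `metric`."""
    def forward(mask):
        # Toy "model": sum of the outputs of the components that are switched on.
        return sum(m * (W @ x) for m, W in zip(mask, components))

    baseline = metric(forward(np.ones(len(components))))
    effects = []
    for i in range(len(components)):
        mask = np.ones(len(components))
        mask[i] = 0.0  # knock out component i only
        effects.append(baseline - metric(forward(mask)))
    return np.array(effects)

rng = np.random.default_rng(0)
d = 8
probe = rng.normal(size=d)
probe /= np.linalg.norm(probe)  # hypothetical direction whose activation we track
x = rng.normal(size=d)

# Four small random components; index 2 also writes strongly along `probe`.
components = [0.01 * rng.normal(size=(d, d)) for _ in range(4)]
components[2] = components[2] + 5.0 * np.outer(probe, x) / (x @ x)

effects = ablation_study(components, x, metric=lambda h: float(probe @ h))
most_important = int(np.argmax(np.abs(effects)))
print(most_important, effects.round(3))
```

In a real transformer the components would be attention heads or MLP blocks and the metric a behavioral score; the loop structure is the part this sketch is meant to convey.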

Break the chains

Targeted obliteration extracts the refusal subspace using singular value decomposition (SVD), then surgically projects it out. The pipeline has six stages: SUMMON → PROBE → DISTILL → EXCISE → VERIFY → REBIRTH.
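The linear algebra behind "extract a subspace with SVD, then project it out" can be sketched on synthetic data. This is a minimal illustration under stated assumptions, not the OBLITERATUS implementation: `extract_directions` and `project_out` are hypothetical helper names, the activations are random arrays, and pairing the two activation sets row by row is a simplification chosen for the example.

```python
import numpy as np

def extract_directions(acts_a, acts_b, k):
    """Top-k right singular vectors of the paired activation differences."""
    diffs = acts_a - acts_b                      # shape (n_prompts, d_model)
    _, _, vt = np.linalg.svd(diffs, full_matrices=False)
    return vt[:k]                                # orthonormal rows, shape (k, d_model)

def project_out(weight, directions):
    """Remove the span of `directions` from the output space of `weight`."""
    projector = directions.T @ directions        # (d_model, d_model), rank k
    return weight - projector @ weight

rng = np.random.default_rng(0)
n, d, k = 64, 16, 2

# Synthetic data: set A differs from set B only inside a hidden 2-D subspace.
hidden, _ = np.linalg.qr(rng.normal(size=(d, k)))   # orthonormal columns
acts_b = rng.normal(size=(n, d))
acts_a = acts_b + rng.normal(size=(n, k)) @ hidden.T * 3.0

directions = extract_directions(acts_a, acts_b, k)
weight = rng.normal(size=(d, d))                 # stand-in for a weight matrix
new_weight = project_out(weight, directions)

# After projection, the matrix can no longer write into the extracted subspace.
residual = np.abs(directions @ new_weight).max()
print(residual)
```

Because the rows of `directions` are orthonormal, `directions.T @ directions` is an orthogonal projector, so subtracting `projector @ weight` leaves every other output direction untouched.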

Understand the geometry

15 deep analysis modules map the precise geometric structure of guardrails: how many distinct refusal mechanisms exist, which layers enforce them, and how they self-repair.

Analysis-informed liberation

The informed method closes the loop: analysis runs during obliteration and auto-configures every decision, including which chains to target, how many directions to extract, and which layers are safe to modify.

Six ways to use OBLITERATUS

HuggingFace Spaces

Zero setup, free GPU via ZeroGPU. Click Obliterate. Done.

Local Web UI

Same Gradio interface running on your own hardware.

Google Colab

Free T4 GPU for models up to ~8B parameters.

CLI

Headless, scriptable automation for pipelines.

Python API

Full programmatic control for research pipelines.

YAML Configs

Reproducible, version-controlled experiments.

Key capabilities

Capability	What it does
Concept Cone Geometry	Maps per-category guardrail directions with solid angle estimation
Alignment Imprint Detection	Fingerprints DPO vs RLHF vs CAI vs SFT from subspace geometry alone
Cross-Model Universality Index	Measures whether guardrail directions generalize across models
Defense Robustness Evaluation	Quantifies the Ouroboros effect and maps safety-capability entanglement
Whitened SVD Extraction	Covariance-normalized direction extraction for cleaner signal
Analysis-Informed Pipeline	Analysis modules auto-configure obliteration strategy mid-pipeline
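The "Whitened SVD Extraction" row can be illustrated with a small sketch. The page does not say how OBLITERATUS computes it, so everything here is an assumption: the helper names are hypothetical, the whitening uses the covariance of the baseline set, and the data is synthetic. The point is only to show why covariance normalization can surface a direction that plain SVD misses when one unrelated axis has very high variance.

```python
import numpy as np

def plain_directions(acts_a, acts_b, k):
    """Top-k right singular vectors of set A centered on the mean of set B."""
    centered = acts_a - acts_b.mean(axis=0)
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return vt[:k]

def whitened_directions(acts_a, acts_b, k, eps=1e-6):
    """Same as above, but after whitening with the covariance of set B."""
    cov = np.cov(acts_b, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)
    # Symmetric inverse square root of the covariance (ZCA-style whitening).
    whitener = eigvecs @ np.diag(1.0 / np.sqrt(eigvals + eps)) @ eigvecs.T
    centered = acts_a - acts_b.mean(axis=0)
    _, _, vt = np.linalg.svd(centered @ whitener, full_matrices=False)
    return vt[:k]                                # directions in whitened coordinates

rng = np.random.default_rng(0)
n, d = 2000, 6
scales = np.array([10.0, 1.0, 1.0, 1.0, 1.0, 1.0])  # axis 0 is high-variance noise
shift = np.zeros(d)
shift[1] = 3.0                                       # the real difference is on axis 1

acts_b = rng.normal(size=(n, d)) * scales
acts_a = rng.normal(size=(n, d)) * scales + shift

plain = plain_directions(acts_a, acts_b, 1)[0]
white = whitened_directions(acts_a, acts_b, 1)[0]
print(np.abs(plain).round(2), np.abs(white).round(2))
```

Plain SVD locks onto the noisy axis 0, while the whitened version recovers axis 1, where the two sets actually differ. The returned directions live in whitened coordinates; mapping them back to the original space is left out of this sketch.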

Built on published research

OBLITERATUS implements techniques from:

Arditi et al. (2024) — Refusal in LLMs is mediated by a single direction
Gabliteration (arXiv:2512.18901) — Adaptive multi-directional neural weight modification
Turner et al. (2023) — Activation Addition / steering vectors
Rimsky et al. (2024) — Contrastive Activation Addition

License

Dual-licensed: AGPL-3.0 for open source use, with a commercial license available for organizations that cannot comply with AGPL obligations. See GitHub Issues for commercial licensing.
