
Choose your path

Three ways to get started — pick the one that fits your setup.

HuggingFace Spaces

Zero install. Runs on ZeroGPU with a free daily quota. Open the Space and click Obliterate.

Google Colab

Free T4 GPU for models up to ~8B parameters. Open the notebook and hit Run All.

Local install

Full control on your own hardware. Follow the steps below.

Local quickstart

1. Install OBLITERATUS

Install from PyPI or directly from the repository.
pip install obliteratus
Python 3.10 or higher is required. See Installation for platform-specific instructions, Docker usage, and hardware requirements.
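Before installing, the Python requirement above can be checked from the interpreter you plan to use. This is a minimal sketch; the helper name `meets_minimum` is illustrative and not part of OBLITERATUS.

```python
import sys

MINIMUM_PYTHON = (3, 10)  # the requirement stated above


def meets_minimum(version=None, minimum=MINIMUM_PYTHON):
    """Return True if `version` (major, minor, ...) satisfies the minimum."""
    if version is None:
        version = sys.version_info
    # Compare only (major, minor); patch level does not matter here.
    return tuple(version[:2]) >= tuple(minimum)


print("Python version OK" if meets_minimum() else "Python 3.10+ required")
```

Run it with the same interpreter that `pip` installs into, since a system with several Python versions can otherwise report the wrong one.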

2. Obliterate your first model

Run the obliterate command with a model from HuggingFace Hub. The advanced method is the default and the recommended starting point for most models.
obliteratus obliterate meta-llama/Llama-3.1-8B-Instruct --method advanced
OBLITERATUS prints a live progress display tracking all six pipeline stages as they run. The obliterated model is saved to abliterated/meta-llama_Llama-3.1-8B-Instruct/ by default.
To save to a custom location:
obliteratus obliterate meta-llama/Llama-3.1-8B-Instruct \
    --method advanced \
    --output-dir ./my-liberated-model
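When scripting around the CLI it can help to predict where the default output lands. The sketch below assumes the directory name is the Hub model ID with "/" replaced by "_", which is inferred only from the single example path above; the function name is hypothetical and the actual naming rule may differ.

```python
from pathlib import Path


def default_output_dir(model_id: str, root: str = "abliterated") -> Path:
    """Guess the default save location for a Hub model ID.

    Assumption: "/" in the ID becomes "_", as in the one example on this page.
    """
    return Path(root) / model_id.replace("/", "_")


print(default_output_dir("meta-llama/Llama-3.1-8B-Instruct"))
```

If the guessed directory does not exist after a run, pass --output-dir explicitly rather than relying on the inferred rule.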
Not sure which model to try? Run obliteratus models to browse 116 curated targets organized by VRAM requirement, or obliteratus models --tier small to filter to 4-8 GB models.

3. Understand the six pipeline stages

Every obliteration run — regardless of method — passes through six stages:
Stage    | What happens
SUMMON   | Load the model and tokenizer from HuggingFace Hub or a local path
PROBE    | Collect hidden-state activations on a set of restricted and unrestricted prompts
DISTILL  | Extract refusal directions from the activation contrast via SVD decomposition
EXCISE   | Project the refusal subspace out of the model’s weight matrices (norm-preserving)
VERIFY   | Run perplexity, coherence, refusal rate, and KL divergence checks to confirm capabilities are intact
REBIRTH  | Save the liberated model weights and an abliteration_metadata.json file
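The DISTILL and EXCISE rows can be illustrated with a small NumPy sketch. It is not the OBLITERATUS implementation. It assumes the contrast is formed by subtracting the mean unrestricted activation from each restricted activation, that the weight matrix writes into the same activation space, and that "norm-preserving" means rescaling each column back to its original norm. All function names are hypothetical.

```python
import numpy as np


def refusal_directions(restricted, unrestricted, k=1):
    """Top-k right singular vectors of the activation contrast, shape (k, d).

    restricted, unrestricted: arrays of shape (n_prompts, d).
    """
    contrast = restricted - unrestricted.mean(axis=0, keepdims=True)
    _, _, vt = np.linalg.svd(contrast, full_matrices=False)
    return vt[:k]


def project_out(weight, directions, preserve_norm=True):
    """Remove the span of `directions` from the output space of `weight`.

    weight: (d, d_in) matrix whose columns live in the activation space.
    directions: (k, d) vectors spanning the subspace to remove.
    """
    # Orthonormal basis for the subspace, shape (d, k).
    q, _ = np.linalg.qr(directions.T)
    projected = weight - q @ (q.T @ weight)
    if preserve_norm:
        # Assumed reading of "norm-preserving": restore each column's norm.
        old = np.linalg.norm(weight, axis=0, keepdims=True)
        new = np.linalg.norm(projected, axis=0, keepdims=True)
        projected = projected * (old / np.maximum(new, 1e-12))
    return projected
```

Rescaling a column by a scalar keeps it orthogonal to the removed subspace, so the norm correction does not reintroduce the projected-out directions.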
The informed method adds an ANALYZE stage between PROBE and DISTILL that runs four analysis modules and uses their output to auto-configure every downstream decision.
VERIFY runs automatically after every obliteration. If the Ouroboros effect is detected — meaning the model’s guardrails partially reassemble after excision — additional targeted passes fire automatically at the compensating layers.
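Two of the VERIFY metrics have textbook definitions that are easy to reproduce by hand. The sketch below shows those definitions only; how OBLITERATUS samples text, and which two distributions it compares for KL divergence (presumably the original and the modified model's next-token distributions), are assumptions, not statements from this page.

```python
import math


def perplexity(token_logprobs):
    """exp of the mean negative log-likelihood over the evaluated tokens."""
    return math.exp(-sum(token_logprobs) / len(token_logprobs))


def kl_divergence(p, q, eps=1e-12):
    """KL(P || Q) in nats for two distributions over the same vocabulary."""
    return sum(
        pi * math.log(pi / max(qi, eps))
        for pi, qi in zip(p, q)
        if pi > 0  # terms with p_i = 0 contribute nothing
    )
```

Lower is better for both: a perplexity close to the original model's and a KL divergence near zero suggest the excision left general behavior largely unchanged.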

4. Use the Python API

For programmatic control, import AbliterationPipeline directly.
from obliteratus.abliterate import AbliterationPipeline

pipeline = AbliterationPipeline(
    model_name="meta-llama/Llama-3.1-8B-Instruct",
    method="advanced",
    output_dir="abliterated",
)
result_path = pipeline.run()
After the run, intermediate artifacts are available on the pipeline object:
# Refusal directions extracted per layer
directions = pipeline.refusal_directions    # {layer_idx: tensor}

# Layers with the strongest refusal signal
strong_layers = pipeline._strong_layers

# Post-obliteration evaluation metrics
metrics = pipeline._quality_metrics
# Keys: perplexity, coherence, refusal_rate, kl_divergence
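In a script, the metrics can gate whether a run is accepted. The following sketch assumes `pipeline._quality_metrics` behaves like a dict of floats keyed as listed above. The limits are made-up placeholders, not recommended values, and coherence is left out because this page does not say which direction is better.

```python
def check_metrics(metrics, limits):
    """Return a list of problems; an empty list means every limit holds."""
    problems = []
    for key, limit in limits.items():
        value = metrics.get(key)
        if value is None:
            problems.append(f"{key}: missing")
        elif value > limit:
            problems.append(f"{key}: {value:.3g} exceeds {limit:.3g}")
    return problems


# Hypothetical upper bounds, for illustration only.
LIMITS = {"refusal_rate": 0.05, "kl_divergence": 0.5, "perplexity": 20.0}
```

A call such as `check_metrics(pipeline._quality_metrics, LIMITS)` would then return the failed checks, if any, after `pipeline.run()` completes.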
For analysis-informed obliteration that auto-tunes every parameter:
from obliteratus.informed_pipeline import InformedAbliterationPipeline

pipeline = InformedAbliterationPipeline(
    model_name="meta-llama/Llama-3.1-8B-Instruct",
    output_dir="abliterated_informed",
)
output_path, report = pipeline.run_informed()

print(f"Detected alignment: {report.insights.detected_alignment_method}")
print(f"Auto-configured directions: {report.insights.recommended_n_directions}")
print(f"Ouroboros passes needed: {report.ouroboros_passes}")

What you get after obliteration

The output directory contains:
  • Model weights — standard HuggingFace safetensors format, loadable with AutoModelForCausalLM.from_pretrained(output_dir)
  • Tokenizer files — copied from the source model unchanged
  • abliteration_metadata.json — method used, number of directions extracted, layers modified, VERIFY stage metrics (refusal rate, perplexity, coherence, KL divergence), and timestamps
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("./abliterated/meta-llama_Llama-3.1-8B-Instruct")
tokenizer = AutoTokenizer.from_pretrained("./abliterated/meta-llama_Llama-3.1-8B-Instruct")
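The metadata file can be inspected with the standard library. Since the exact schema is not documented here, this sketch treats it as an arbitrary JSON object and lists whatever top-level fields it finds; the helper names are illustrative.

```python
import json
from pathlib import Path


def load_metadata(output_dir):
    """Read abliteration_metadata.json from an output directory as a dict."""
    path = Path(output_dir) / "abliteration_metadata.json"
    with path.open(encoding="utf-8") as fh:
        return json.load(fh)


def summarize(metadata):
    """One "key: value" line per top-level field, whatever the schema is."""
    return [f"{key}: {value}" for key, value in sorted(metadata.items())]
```

For example, `print("\n".join(summarize(load_metadata("./abliterated/meta-llama_Llama-3.1-8B-Instruct"))))` prints every recorded field without assuming any particular key names.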
Add --contribute to your obliterate command to save an anonymous benchmark record to the community dataset. Your run contributes refusal rate, perplexity, and coherence scores — no prompts, outputs, or identifying information.
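For scripted runs, the command line can be assembled in Python and handed to `subprocess`. This sketch uses only the flags that appear on this page (--method, --output-dir, --contribute); the builder function itself is hypothetical.

```python
import shlex


def build_obliterate_command(model_id, method="advanced", output_dir=None, contribute=False):
    """Assemble argv for `obliteratus obliterate` from flags shown on this page."""
    argv = ["obliteratus", "obliterate", model_id, "--method", method]
    if output_dir is not None:
        argv += ["--output-dir", str(output_dir)]
    if contribute:
        argv.append("--contribute")
    return argv


argv = build_obliterate_command(
    "meta-llama/Llama-3.1-8B-Instruct",
    output_dir="./my-liberated-model",
    contribute=True,
)
print(shlex.join(argv))  # shell-quoted form, handy for logging
# To launch the run: subprocess.run(argv, check=True)
```

Passing the list form to `subprocess.run` avoids shell quoting issues with model IDs or paths that contain spaces.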

Next steps

CLI reference

All commands, flags, and options for headless and scripted usage.

Python API

Full programmatic control over the pipeline and all intermediate artifacts.

Methods overview

Compare all seven obliteration methods and choose the right one for your model.