Explanations

modifiable, learnable, applicable

This neuron flags anomalous or out-of-distribution tokens, especially stray dashes and other rare or unusually formatted tokens, that stand out from normal text.

Configuration

Prompts (Dashboard): 392,802 prompts, 256 tokens each

Dataset (Dashboard): monology/pile-uncopyrighted

Negative Logits

Token            Value
"čkom"           0.83
" risk"          0.82
" Haber"         0.80
"る"             0.79
"،"              0.79
" counterclaim"  0.77
" gefallen"      0.77
"Viva"           0.76
" ڈپاز"          0.75
" timeframe"     0.75

Positive Logits

Token            Value
"ּ"              0.78
" полови"        0.75
" aussitôt"      0.73
"cil"            0.71
" itertools"     0.70
" सहयोगी"        0.70
" आर"            0.70
" difíc"         0.68
"嗓"             0.68
" đôi"           0.66

Activations Density 0.000%
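
A density shown as 0.000% does not necessarily mean the neuron never fires. The sketch below works through the arithmetic under three assumptions that the page does not confirm: that density is the percentage of tokens with a nonzero activation, that it is measured over the configured prompt set (392,802 prompts of 256 tokens), and that the display rounds to three decimal places. The function names are hypothetical.

```python
# Assumption: density = 100 * (tokens with nonzero activation) / (total tokens),
# measured over the dashboard prompt set and displayed to three decimals.

PROMPTS = 392_802
TOKENS_PER_PROMPT = 256


def total_tokens(prompts: int = PROMPTS, tokens_per_prompt: int = TOKENS_PER_PROMPT) -> int:
    """Number of token positions in the prompt set."""
    return prompts * tokens_per_prompt


def format_density(active_tokens: int, n_tokens: int) -> str:
    """Format the share of active tokens as a percentage with three decimals."""
    return f"{100.0 * active_tokens / n_tokens:.3f}%"


if __name__ == "__main__":
    n = total_tokens()
    print(n)                       # 100557312
    print(format_density(0, n))    # 0.000%
    print(format_density(502, n))  # 0.000%
    print(format_density(503, n))  # 0.001%
```

Under these assumptions, up to about 500 active tokens out of roughly 100 million would still display as 0.000%, so the figure is consistent with a very sparse neuron as well as with one that never activates.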