Explanations
Proper nouns, specifically names that likely refer to a person named "Leon".
Mentions of the name "Leon".
Negative Logits
tru            -0.74
intolerance    -0.72
NG             -0.71
UCHIJ          -0.71
bund           -0.70
bonded         -0.67
SG             -0.66
MM             -0.65
NB             -0.65
intoler        -0.64
Positive Logits
Leon           3.75
Leon           2.56
leon           1.68
Leo            1.42
Anton          1.34
Leonard        1.23
Leonardo       1.17
Dmit           1.16
Engel          1.13
Leone          1.06
Activations Density 0.018%