Explanations
references to historical figures, particularly Napoleon and related events
Negative Logits
Token    Logit
ntax     -0.17
mach     -0.15
Mui      -0.15
mach     -0.15
gger     -0.15
tg       -0.14
oker     -0.14
/os      -0.14
leck     -0.14
cheid    -0.14
Positive Logits
Token         Logit
Napoleon      0.37
Nap           0.34
Bon           0.26
nap           0.25
Waterloo      0.25
Bon           0.23
181           0.23
Wellington    0.20
ÙĨاب          0.19
180           0.18
Activations Density: 0.026%