Explanations
proper nouns related to historical figures or places
references to historical figures, specifically those named Ferdinand
Negative Logits
Token    Logit
SF       -0.72
odi      -0.72
Apex     -0.72
fters    -0.71
atham    -0.70
eah      -0.68
alon     -0.68
Kend     -0.68
NJ       -0.67
Tree     -0.66
Positive Logits
Token                  Logit
Ferdinand              3.52
д                      2.31
dinand                 1.71
ienne                  1.12
deposition             1.01
(undecodable bytes)    1.01
(undecodable bytes)    0.97
Gupta                  0.96
Fer                    0.96
ocide                  0.96
Activations Density: 0.052%