INDEX
Explanations
names or proper nouns related to various individuals
proper nouns and specific names related to individuals and concepts
Negative Logits
owing     -0.76
cano      -0.74
cess      -0.73
ame       -0.72
erton     -0.72
enium     -0.69
anted     -0.68
ames      -0.67
ington    -0.66
ridge     -0.66
Positive Logits
Beir                0.85
SHIP                0.84
Zah                 0.81
Kahn                0.80
Hollande            0.75
++++++++++++++++    0.75
Macron              0.73
Fine                0.71
bars                0.70
Akin                0.70
Activations Density: 0.040%