INDEX
Explanations
names of people, especially historical figures
Negative Logits
comed        -1.06
umpy         -1.05
intern       -1.00
comings      -0.99
icky         -0.97
REL          -0.95
risome       -0.95
efficients   -0.93
EEP          -0.93
aryl         -0.92
Positive Logits
III          1.06
sson         1.05
Roberts      1.04
Hubbard      1.02
Hes          1.00
Rodrig       1.00
Militia      0.99
ovich        0.97
Tenth        0.94
Gib          0.94
Activations Density: 1.655%