INDEX
Explanations
names of political figures from various parties and contexts
Negative Logits
Compare               -0.74
yz                    -0.65
Lauder                -0.61
..................    -0.60
Compare               -0.57
Rothschild            -0.57
EVA                   -0.56
PN                    -0.55
LIN                   -0.55
Pwr                   -0.55
Positive Logits
ELF                   1.08
ullivan               1.05
own                   1.04
newest                0.92
selves                0.89
favourite             0.87
favorite              0.85
pecially              0.83
kaya                  0.83
lightly               0.82
Activations Density: 0.949%