INDEX
Explanations
references to various presidents
Negative Logits
데      -0.16
rous    -0.15
ites    -0.15
weise   -0.15
tras    -0.14
nes     -0.14
sobie   -0.14
adaki   -0.14
本      -0.14
tron    -0.14
Positive Logits
aux         0.16
ially       0.15
larg        0.15
elijke      0.15
zimmer      0.15
ships       0.15
rick        0.15
republiky   0.15
ONGL        0.15
-relative   0.14
Activations Density 0.043%