INDEX
Explanations
positive and negative news statements
Negative Logits
Downing   -0.15
after     -0.15
Ble       -0.14
коз       -0.14
if        -0.14
Nations   -0.14
force     -0.14
see       -0.13
anas      -0.13
unless    -0.13
Positive Logits
ucc       0.16
oldur     0.16
urname    0.15
agli      0.15
erule     0.15
Narrated  0.15
åįĵ       0.15
owy       0.14
.uf       0.14
owe       0.14
Activations Density: 0.015%