INDEX
Explanations
specific political figures and their actions or statements
Negative Logits
arov     -0.21
aday     -0.17
caler    -0.16
COPE     -0.16
Ń        -0.16
dff      -0.15
arent    -0.15
neh      -0.14
иÑģÑĮ     -0.14
anna     -0.14
Positive Logits
igor            0.15
reta            0.15
Чи              0.14
671             0.14
oric            0.14
ILT             0.14
RenderWindow    0.14
igma            0.14
rosso           0.14
JNIEnv          0.14
Activations Density: 0.047%