INDEX
Explanations
negative sentiments toward political figures and their actions
Negative Logits
Token          Logit
EVEN           -0.51
INCLUDING      -0.49
hi             -0.48
snippetHide    -0.48
even           -0.47
albeit         -0.47
Rasp           -0.46
pang           -0.46
WOR            -0.46
@[+][          -0.45
Positive Logits
Token              Logit
iprot              0.65
Rüyada             0.65
WithIOException    0.63
Whatever           0.61
kasarigan          0.61
Whatever           0.60
חיצוניים           0.60
termica            0.57
Indep              0.57
whatever           0.55
Activations Density 0.040%