INDEX
Explanations
mentions of individuals or groups involved in decision-making or leadership roles
Negative Logits
ehr      -0.15
apid     -0.15
erti     -0.14
IDEOS    -0.14
illard   -0.14
aska     -0.13
vil      -0.13
FRING    -0.13
ål       -0.13
adir     -0.13
Positive Logits
think      0.65
believe    0.60
thinks     0.57
think      0.55
believes   0.55
feel       0.52
Think      0.52
认为       0.51
Think      0.50
THINK      0.48
Activations Density 0.863%
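
The density figure above is not defined in this listing. One common reading, assumed here, is the fraction of token positions at which the feature's activation is above zero. The sketch below illustrates that reading only; the function name `activation_density`, the `threshold` parameter, and the sample values are hypothetical and do not come from the dashboard.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of positions whose activation exceeds threshold.

    Assumption: "density" means the share of token positions where the
    feature fires. The dashboard itself does not state its definition.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical activations over 8 token positions (not real data).
sample = [0.0, 0.0, 1.2, 0.0, 0.0, 0.0, 0.0, 0.0]
density = activation_density(sample)
print(f"{density:.3%}")
```

Under this reading, a density of 0.863% would mean the feature fires on a little under 9 of every 1,000 token positions.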