INDEX
Explanations
references to leadership and authority figures
leader and leadership
Negative Logits
$           -0.35
vuông       -0.35
Activités   -0.30
window      -0.30
fancy       -0.30
spesial     -0.30
Japanese    -0.30
curiosity   -0.29
packung     -0.29
мона        -0.29
Positive Logits
leader          0.82
leader          0.75
leaders         0.73
leaders         0.73
Leaders         0.73
pemimpin        0.69
таратура        0.69
LookAnd         0.64
líderes         0.64
parsedMessage   0.64
Activations Density 0.012%