Explanations
references to individuals in temporary or role-specific leadership positions
mentions of the word "acting" and its variations in various contexts
Negative Logits
Token      Logit
xual       -0.83
ilings     -0.69
Blackwell  -0.66
joy        -0.65
nda        -0.64
front      -0.64
Sovere     -0.64
ciating    -0.63
fore       -0.61
fac        -0.61
Positive Logits
Token      Logit
uary       0.97
uate       0.84
iott       0.80
inic       0.79
uated      0.78
uations    0.78
icter      0.75
ional      0.75
ives       0.72
uka        0.69
Activations Density: 0.016%