INDEX
Explanations
phrases related to professional affiliations and career transitions
phrases that reference specific teams or organizations
Negative Logits
NetMessage    -0.75
rats          -0.73
selection     -0.72
agree         -0.71
rad           -0.69
Figure        -0.69
perse         -0.68
assed         -0.68
hari          -0.67
headers       -0.67
Positive Logits
same            1.02
aforementioned  0.97
Clintons        0.96
prestigious     0.95
likes           0.95
Department      0.88
United          0.88
highest         0.86
latter          0.84
infamous        0.83
Activations Density: 0.278%