INDEX
Explanations
mentions of organizations
repeated references to various organizations
Negative Logits
Token      Logit
dos        -0.71
ths        -0.70
Tro        -0.68
kowski     -0.66
ixed       -0.66
chie       -0.65
gets       -0.65
req        -0.64
aults      -0.64
stroke     -0.63
Positive Logits
Token            Logit
organization     1.01
eers             1.00
affili           0.93
organisation     0.85
organizations    0.83
eering           0.78
ality            0.78
bom              0.78
ally             0.77
arily            0.77
Activations Density 0.022%