Explanations
names of news organizations
references to a specific media outlet or journalist
Negative Logits
egg       -0.76
meet      -0.75
ride      -0.73
makers    -0.73
tions     -0.72
swick     -0.71
aign      -0.71
tons      -0.71
maker     -0.70
stack     -0.69
Positive Logits
Äĩ        1.19
udi       0.99
ère       0.97
zzo       0.95
ÃŁ        0.88
plom      0.87
qi        0.87
veyard    0.85
urnal     0.85
ennes     0.82
Activations Density 0.019%