INDEX
Explanations
references to specific news organizations and individuals
mentions of specific organizations or entities, particularly in a news context
Negative Logits
pering      -0.90
acters      -0.83
phas        -0.81
uring       -0.81
obi         -0.81
akable      -0.79
oller       -0.77
tha         -0.76
auga        -0.75
opher       -0.75
Positive Logits
ND          0.84
Seym        0.78
Unicorn     0.67
ND          0.65
Airport     0.62
ACP         0.62
Dil         0.61
clearance   0.60
Editors     0.59
Fein        0.59
Activations Density: 0.062%