Explanations
news outlets and platforms
references to prominent news publications
Negative Logits
range               -0.79
interstitial        -0.74
$.                  -0.70
gone                -0.64
grounding           -0.63
0000000000000000    -0.63
pant                -0.61
halla               -0.60
Confederacy         -0.58
sed                 -0.58
Positive Logits
understands          1.05
correspondent        0.97
reports              0.96
investigates         0.96
columnist            0.94
reporter             0.94
reported             0.93
publishes            0.92
Editorial            0.90
ran                  0.88
Activations Density: 0.239%