INDEX
Explanations
specific articles or stories, possibly from news outlets
references to news articles and stories
Negative Logits
ibles       -0.64
joystick    -0.63
imus        -0.62
practise    -0.62
natives     -0.61
emonic      -0.60
dues        -0.59
fulfil      -0.59
arers       -0.57
ucer        -0.56
Positive Logits
headlined   1.40
titled      1.11
alleging    1.07
reporting   1.05
headline    0.98
detailing   0.97
published   0.97
article     0.96
article     0.95
exposing    0.93
Activations Density 0.131%