INDEX
Explanations
specific dates and names from news articles or reports
citations and publication details in texts
Negative Logits
unic            -0.72
refunds         -0.70
torped          -0.66
rightful        -0.62
susp            -0.62
Vall            -0.61
trope           -0.60
radius          -0.59
discriminate    -0.58
closet          -0.58
Positive Logits
BBC             1.40
Editor          1.06
Narr            1.05
Guest           1.03
Contribut       0.98
Associated      0.94
EVA             0.93
Edited          0.93
Bloomberg       0.92
HAEL            0.89
Activations Density: 0.073%