Explanations
names of individuals and entities in news articles
specific names and quantities related to people, locations, and conditions
Negative Logits
',"        -0.67
uador      -0.67
apest      -0.67
anwhile    -0.67
qqa        -0.63
ESA        -0.62
cause      -0.60
'.         -0.60
hedon      -0.59
"]=>       -0.59
Positive Logits
,          0.94
*,         0.93
,,         0.91
?,         0.83
,[         0.81
!,         0.79
®,         0.78
,...       0.72
,          0.71
,)         0.65
Activations Density 0.811%