INDEX
Explanations
phrases indicating the perception of support or discontent
Negative Logits
Token        Logit
agate        -0.84
ramids       -0.73
ocr          -0.71
FORMATION    -0.68
querque      -0.68
Accessed     -0.67
verage       -0.66
totaling     -0.62
record       -0.62
killing      -0.62
Positive Logits
Token          Logit
onlook         0.99
observers      0.96
economists     0.94
commenters     0.94
reviewers      0.92
physicists     0.92
pundits        0.89
commentators   0.88
Krugman        0.88
locals         0.88
Activations Density: 0.160%