Explanations
adverbs indicating intensity or degree
Negative Logits
understatement  -0.73
orial           -0.64
coming          -0.64
aptic           -0.62
itant           -0.60
Borough         -0.58
pex             -0.58
Bark            -0.57
opausal         -0.57
Issue           -0.56
Positive Logits
differently  1.05
smoothly     0.90
overboard    0.87
into         0.86
badly        0.83
nicely       0.81
slack        0.81
closely      0.80
poorly       0.80
spor         0.79
Activations Density: 0.102%