INDEX
Explanations
adverbs that indicate how an action is carried out
adverbs indicating manner or frequency
Negative Logits
Token       Logit
rite        -0.68
tein        -0.67
AJ          -0.66
oute        -0.63
rys         -0.63
ACL         -0.60
hao         -0.59
TY          -0.57
LH          -0.57
privilege   -0.57
Positive Logits
Token       Logit
afterward   0.77
thereafter  0.77
during      0.75
throughout  0.74
afterwards  0.74
ILCS        0.74
clothed     0.72
outdoors    0.71
onstage     0.70
indoors     0.69
Activations Density 0.149%