INDEX
Explanations
adverbs that convey varying degrees of intensity or manner
Negative Logits
Adapt     -0.56
ند        -0.55
-         -0.55
al        -0.53
an        -0.53
limits    -0.53
us        -0.52
!         -0.52
marks     -0.52
ge        -0.51
Positive Logits
xically          1.44
weirdly          1.24
curiously        1.20
sively           1.18
ently            1.17
interestingly    1.16
iously           1.15
powerfully       1.14
ctively          1.13
tably            1.13
Activations Density 0.343%