Explanations
adverbs modifying actions, often emphasizing manner, frequency, or intensity
Negative Logits
al      -0.77
Adapt   -0.69
z       -0.66
an      -0.65
Amad    -0.65
ck      -0.63
us      -0.60
r       -0.60
Wulf    -0.60
hals    -0.59
Positive Logits
']")        1.31
ently       1.30
sively      1.29
denly       1.28
handedly    1.22
)";         1.22
"]),        1.21
'):         1.20
quely       1.18
"])         1.17
Activations Density 0.642%