Explanations
expressions of intensity or extremity related to effort, dedication, or persistence
Negative Logits
ÎķÎĽ       -0.15
.mdl      -0.15
วล        -0.15
osu       -0.15
arella    -0.14
.LENGTH   -0.14
ALLED     -0.14
reau      -0.14
kuk       -0.14
indr      -0.14
Positive Logits
ingly     0.25
ably      0.23
/un       0.21
nes       0.20
ly        0.19
amounts   0.18
baar      0.18
upo       0.16
ously     0.15
urge      0.15
Activations Density 0.089%