INDEX
Explanations
phrases indicating size or magnitude
Negative Logits
oot       -0.17
Poss      -0.16
à¥įह     -0.15
itol      -0.14
Wonder    -0.14
soever    -0.14
mc        -0.14
pliers    -0.14
mh        -0.14
storm     -0.13
POSITIVE LOGITS
pike              0.18
ëļ                0.16
leine             0.16
strup             0.15
precated          0.15
_TUN              0.14
ogy               0.14
(chan             0.14
çĽijåIJ¬é¡µéĿ¢    0.13
apiro             0.13
Activations Density 0.030%