INDEX
Explanations
phrases indicating smallness or modesty
Negative Logits
acker            -0.17
_UNUSED          -0.15
stå              -0.15
åľ¨              -0.15
InParameter      -0.14
ÃĹ↵↵             -0.14
zure             -0.14
zeit             -0.14
sourceMapping    -0.13
awaiter          -0.13

Positive Logits
manner            0.48
nutshell          0.40
way               0.38
hurry             0.35
effort            0.31
bid               0.29
fashion           0.27
attempt           0.24
matter            0.24
heartbeat         0.24
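
The two lists above are a token-to-weight mapping split by sign and sorted by strength. A minimal sketch of that split follows, using a handful of the values shown on this page. The function name `rank_logits` and the parameter `k` are hypothetical, chosen for illustration; the sketch does not show how the dashboard derives the weights themselves.

```python
def rank_logits(weights, k=10):
    """Split a token -> weight mapping into (negative, positive) lists.

    Each list holds (token, weight) pairs, strongest first, at most k long.
    `rank_logits` and `k` are illustrative names, not part of any dashboard API.
    """
    negative = sorted(
        (item for item in weights.items() if item[1] < 0),
        key=lambda item: item[1],          # most negative first
    )[:k]
    positive = sorted(
        (item for item in weights.items() if item[1] > 0),
        key=lambda item: item[1],
        reverse=True,                      # most positive first
    )[:k]
    return negative, positive


# A few of the values listed on this page.
weights = {
    "manner": 0.48,
    "nutshell": 0.40,
    "way": 0.38,
    "acker": -0.17,
    "_UNUSED": -0.15,
    "zure": -0.14,
}

negative, positive = rank_logits(weights, k=3)
print(negative)
print(positive)
```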
Activations Density 0.130%
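
One common reading of an activation density figure is the fraction of sampled tokens on which the feature fires at all. The sketch below assumes that definition; the page does not state how its figure is computed. The function `activation_density`, the `threshold` parameter, and the sample are all hypothetical, and the sample is synthetic data built only to reproduce the same percentage.

```python
def activation_density(activations, threshold=0.0):
    """Fraction of activations strictly above `threshold`.

    Assumption: "density" means the share of sampled tokens on which the
    feature is active. The name and signature are illustrative only.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Synthetic sample: 13 active tokens out of 10,000.
sample = [0.0] * 9987 + [1.2] * 13
density = activation_density(sample)
print(f"{density:.3%}")
```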