INDEX
Explanations
phrases related to the concept of staying or remaining in a particular state or location
Negative Logits
/from     -0.18
vale      -0.17
antine    -0.16
äº        -0.16
erm       -0.15
nap       -0.15
ured      -0.15
rypted    -0.14
mente     -0.14
esc       -0.14
Positive Logits
cation    0.23
ders      0.19
away      0.18
true      0.18
true      0.17
tuned     0.16
alive     0.16
-away     0.16
подаль    0.16
_alive    0.15
Activations Density 0.030%