Explanations
phrases related to detachment, escaping, and walking away from situations
Negative Logits
umbers     -0.76
ounce      -0.67
enegger    -0.61
ãĥ¥        -0.58
immer      -0.57
iop        -0.57
cycl       -0.56
tone       -0.56
amar       -0.56
wave       -0.55
Positive Logits
from       1.13
FROM       1.06
from       1.01
From       0.96
From       0.94
away       0.82
ired       0.78
owship     0.75
heid       0.72
iable      0.72
Activations Density: 2.093%