Explanations
phrases related to progression towards goals and aspirations
Negative Logits
Token          Logit
unos           -0.15
prise          -0.15
ymax           -0.15
_startup       -0.14
ieu            -0.14
urger          -0.14
panse          -0.14
остат          -0.13
ott            -0.13
intervention   -0.13
Positive Logits
Token          Logit
towards        0.50
toward         0.48
Towards        0.45
Towards        0.44
Tow            0.33
closer         0.33
向             0.29
owards         0.29
oward          0.28
hacia          0.28
Activations Density 0.138%
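
The page does not say how the density figure is computed. A minimal sketch, assuming it is the fraction of sampled token positions on which the feature's activation is above zero; the function name `activation_density`, the `threshold` parameter, and the sample data are all hypothetical and only illustrate the arithmetic:

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of positions whose activation exceeds `threshold`.

    Assumption: "density" means the share of token positions where the
    feature fires at all. The dashboard does not confirm this definition.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: 10,000 token positions, 14 of which activate.
sample = [0.0] * 9986 + [1.2] * 14
density = activation_density(sample)
print(f"{density:.3%}")
```

Under this reading, a density of 0.138% would mean the feature fires on roughly 1 in every 725 tokens, which fits a feature tied to a narrow family of tokens such as "towards" and its variants.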