Explanations
phrases that indicate moving forward or continuing with a specific action or plan
Negative Logits
Token     Logit
eson      -0.16
cus       -0.16
alach     -0.16
ussed     -0.15
bell      -0.14
ucht      -0.14
k         -0.14
going     -0.14
gren      -0.14
gon       -0.14
Positive Logits
Token     Logit
with      0.20
toward    0.18
with      0.18
dengan    0.17
với       0.16
swith     0.16
towards   0.16
ure       0.16
proced    0.16
withd     0.15
Activations Density 0.034%
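
For readers unfamiliar with the density figure, the sketch below shows one common way such a number can be computed: the percentage of tokens on which a feature's activation is above zero. This is an assumption about how the value is defined, not a description of how this page derived 0.034%. The function name `activation_density`, the `threshold` parameter, and the sample data are all hypothetical.

```python
def activation_density(activations, threshold=0.0):
    """Return the percentage of tokens whose activation exceeds `threshold`.

    Assumption: "density" means the share of tokens on which the feature
    fires at all. The dashboard may use a different definition.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return 100.0 * active / len(activations)


# Hypothetical data: 10,000 tokens, of which 3 activate the feature.
sample = [0.0] * 10_000
for i in (12, 345, 6789):
    sample[i] = 1.5

density = activation_density(sample)
print(f"{density:.3f}%")
```

A density this low means the feature is sparse: it fires on only a handful of tokens per ten thousand, which fits a feature tied to a narrow family of phrases.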