Explanations
moving into or through places
NEGATIVE LOGITS
Token      Value
Coming     0.21
oy         0.20
estuv      0.19
₁+         0.19
Going      0.19
Gdy        0.19
terletak   0.19
estando    0.19
Staying    0.19
avec       0.18
POSITIVE LOGITS
Token      Value
into       0.74
into       0.56
इनटू       0.54
vào        0.53
INTO       0.50
onto       0.48
Into       0.47
Into       0.46
away       0.44
through    0.43
Activations Density 0.169%