INDEX
Explanations
the term "back" in various contexts and phrases
Negative Logits
Token     Logit
xbb       -0.15
depend    -0.15
oron      -0.15
abel      -0.14
LOCKS     -0.14
uyen      -0.14
allis     -0.14
-vous     -0.14
abelle    -0.14
opot      -0.13
Positive Logits
Token     Logit
slash     0.21
ness      0.20
side      0.19
/back     0.18
ronym     0.18
ed        0.18
iw        0.18
eo        0.17
slashes   0.17
yr        0.16
Activations Density: 0.069%