Explanations
the word "behind" and, to a lesser extent, words related to publishing material
Negative Logits
behind      -2.50
behind      -2.27
Behind      -2.11
Behind      -2.06
BEHIND      -2.03
derrière    -1.94
detrás      -1.75
dietro      -1.72
<bos>       -1.46
bakom       -1.45
Positive Logits
}`}         0.53
Starting    0.51
omge        0.47
zase        0.47
labdar      0.47
voet        0.46
vyk         0.44
ışık        0.44
dom         0.43
Altman      0.43
Activations Density: 2.545%