INDEX
Explanations
front or before in multiple languages
Negative Logits
прежде      -0.11
PRI         -0.10
Prior       -0.10
overhead    -0.09
athers      -0.09
antic       -0.09
buzz        -0.09
prior       -0.09
ereo        -0.08
Ned         -0.08
Positive Logits
devant      0.54
front       0.47
frente      0.42
front       0.38
ante        0.30
앞          0.30
перед       0.30
前          0.30
_front      0.29
önünde      0.28
Activations Density 0.134%