Explanations
put ' to' or similar structure
Negative Logits
tragen        1.00
construido    0.91
zumindest     0.86
costruito     0.86
zonder        0.83
múltiplos     0.83
ohne          0.82
Poké          0.81
gebaut        0.79
vollständig   0.79
Positive Logits
е             1.09
о             0.92
ет            0.76
ri            0.75
ні            0.72
r             0.72
го            0.72
ти            0.70
mi            0.69
Совет         0.69
Activations Density 0.003%