INDEX
Explanations
phrases indicating purpose or specification
Negative Logits
ectin    0.40
\\    0.39
invitados    0.38
Beverungen    0.38
enos    0.37
vorbe    0.37
وهكذا    0.37
اموش    0.37
Conn    0.37
securely    0.37
Positive Logits
with    0.50
ಹೇಗೆ    0.49
without    0.48
без    0.48
on    0.47
vs    0.47
versus    0.47
недостатки    0.46
для    0.46
for    0.45
Activations Density: 0.001%