Explanations
verbs followed by prepositions
Negative Logits
Раз             0.90
<unused2197>    0.83
например        0.82
Мы              0.81
Про             0.80
Aunque          0.79
Мне             0.78
рекомендуется   0.77
虽然            0.76
предназначен    0.75
Positive Logits
every           1.26
into            1.13
themselves      1.10
fewer           1.09
differently     1.09
puns            1.05
lessly          1.03
ously           1.00
onto            1.00
everytime       0.98
Activations Density: 0.306%