Explanations
take followed by a preposition
Negative Logits
ру           0.30
sı           0.28
inje         0.26
LookAnd      0.26
arrondies    0.25
їн           0.25
traducir     0.25
неоп         0.25
acciones     0.24
вай          0.24
Positive Logits
advantage    0.44
care         0.36
a            0.35
overs        0.31
ventaja      0.30
advantage    0.30
Vorteil      0.29
advant       0.28
up           0.28
effect       0.27
Activations Density: 0.022%