INDEX
Explanations
phrases involving prepositions and roles
Negative Logits
ene  0.50
ens  0.45
ነገር  0.45
ඉතා  0.44
ღვ  0.43
ओह  0.43
oligodend  0.42
েরিক  0.42
contrats  0.42
iter  0.41
Positive Logits
тия  0.50
kanyang  0.47
Prasad  0.45
осо  0.43
yandan  0.42
жения  0.39
UO  0.39
гда  0.39
લાભ  0.38
sua  0.38
Activations Density 0.002%
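
The page does not say how the density figure is computed. A minimal sketch, assuming it is the fraction of token positions on which the feature's activation is non-zero: the function name `activation_density`, the `threshold` parameter, and the sample data are all hypothetical and chosen only to reproduce a value of the same size.

```python
from typing import Sequence


def activation_density(activations: Sequence[float], threshold: float = 0.0) -> float:
    """Return the fraction of positions whose activation exceeds `threshold`.

    Assumption: "density" means the share of tokens on which the feature fires.
    """
    if len(activations) == 0:
        raise ValueError("activations must not be empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical data: the feature fires on 2 of 100,000 token positions.
sample = [0.0] * 100_000
sample[10] = 1.3
sample[500] = 0.7

density = activation_density(sample)
print(f"{density:.3%}")  # prints 0.002%
```

Under this reading, a density of 0.002% corresponds to roughly two active tokens per hundred thousand, which would make this a very sparse feature.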