Explanations
definitely followed by a verb
Negative Logits
i  1.52
eers  1.20
eer  1.14
ição  1.09
presentar  1.07
日から  1.07
ių  1.05
iş  1.03
formar  1.02
oo  1.02
Positive Logits
д  1.16
за  1.14
ية  1.13
га  1.13
𝒑  1.09
ס  1.09
ם  1.07
มี  1.05
м  1.05
פ  1.04
Activations Density 0.222%