Explanations
closing parenthesis followed by asterisks
Negative Logits
Token        Value
s            1.73
ség          1.73
schaft       1.70
{~           1.62
keyst        1.47
shower       1.45
ಾನೂ          1.45
J            1.41
showering    1.41
adduced      1.34
Positive Logits
Token        Value
ه            3.05
л            2.92
ת            2.59
د            2.52
𝓲            2.39
ی            2.36
هي           2.31
ி            2.25
i            2.22
ി            2.20
Activations Density: 0.380%