Negative Logits
Token     Value
ିକ        1.45
(?        1.27
৪         1.24
(         1.20
ιών       1.16
গত        1.13
ally      1.11
UG        1.09
(“        1.09
zana      1.04
Positive Logits
Token     Value
ن         2.17
т         1.75
s         1.64
ar        1.58
pesky     1.55
č         1.52
د         1.51
क         1.50
ع         1.49
го        1.48
Activations Density: 0.003%