INDEX
Negative Logits
ه  0.80
ة  0.67
ب  0.65
Czas  0.64
a  0.64
ስ  0.64
Мы  0.63
Pep  0.63
ture  0.61
risi  0.61
POSITIVE LOGITS
langu  0.61
(  0.61
postcard  0.59
,  0.59
↵  0.58
ме  0.57
चणी  0.55
land  0.55
$<  0.55
̃  0.54
Activations Density 0.001%