Negative Logits
ER    0.75
3    0.73
لا    0.70
س    0.69
2    0.68
5    0.67
4    0.66
يت    0.66
До    0.66
и    0.66
Positive Logits
,    0.79
↵    0.77
↵↵    0.75
hedon    0.71
Pearce    0.68
perpetuated    0.67
Ironically    0.65
desses    0.64
(=    0.63
verwend    0.62
Activations Density 0.462%