INDEX
Negative Logits
bernama    0.82
ם    0.64
them    0.61
므로    0.61
lilac    0.61
으며    0.59
ных    0.59
aniyam    0.58
named    0.58
beanie    0.57
Positive Logits
s    0.92
0    0.79
K    0.73
ر    0.68
LL    0.57
}")    0.57
P    0.56
},    0.55
}*/    0.55
Rankings    0.55
Activations Density: 0.015%