INDEX
NEGATIVE LOGITS
C  0.83
지  0.80
W  0.80
يز  0.78
기  0.75
g  0.74
M  0.72
t  0.71
Қ  0.70
D  0.69
POSITIVE LOGITS
мень  0.59
ästä  0.59
\  0.57
contacter  0.57
🏻  0.56
bantuan  0.55
роках  0.55
verden  0.55
बचने  0.55
rattling  0.54
Activations Density 0.004%