INDEX
Negative Logits
ipation    0.41
들         0.39
Nich       0.38
Didn       0.35
deprive    0.35
заня       0.34
대한       0.34
مادر       0.34
능         0.34
ريقة       0.33
POSITIVE LOGITS
FF         0.77
BB         0.73
MM         0.72
GG         0.71
LL         0.71
KK         0.70
HH         0.70
BBB        0.67
VV         0.67
JJ         0.66
Activations Density 0.049%