INDEX
NEGATIVE LOGITS
으로    -0.70
nare    -0.69
ness    -0.60
vare    -0.57
iare    -0.57
äre     -0.57
s       -0.56
ی       -0.55
nant    -0.54
Bare    -0.54
POSITIVE LOGITS
ſelves           0.67
PerformLayout    0.67
myſelf           0.66
Anſ              0.60
Efq              0.59
utafitiHapana    0.59
insatz           0.57
otomatig         0.56
himſelf          0.56
eeeeeeee         0.55
Activations Density 0.310%