INDEX
Negative Logits
Token    Logit
y        0.71
أ        0.71
1        0.70
ب        0.70
م        0.68
in       0.67
ной      0.66
تح       0.66
ő        0.66
In       0.65
Positive Logits
Token        Logit
donut        1.05
Donuts       1.00
🍩           0.99
donut        0.93
donuts       0.91
doughnuts    0.84
st           0.81
doughnut     0.79
ва           0.79
anez         0.78
Activations Density 0.002%