INDEX
NEGATIVE LOGITS
are      0.79
ö        0.75
.        0.70
ir       0.68
mi       0.66
.<       0.63
A        0.62
have     0.62
haue     0.61
ఆంధ్ర     0.59
POSITIVE LOGITS
ed       0.79
اعر      0.79
τε       0.74
రు       0.69
že       0.69
čo       0.68
戎       0.68
efeller  0.67
trajet   0.67
و        0.66
Activations Density 0.001%