INDEX
NEGATIVE LOGITS
déclaration   0.40
hate          0.39
Simplifying   0.38
घटना          0.38
আসনে          0.38
declared      0.38
declared      0.37
Declaration   0.36
تعمیر         0.36
طلعت          0.36
POSITIVE LOGITS
Sak        0.80
Sak        0.80
sak        0.75
Sac        0.70
sac        0.66
Hokkaido   0.66
sak        0.65
Sac        0.63
SAC        0.60
sac        0.59
Activations Density 0.005%