INDEX
NEGATIVE LOGITS
Token    Logit
Gir      -0.68
ners     -0.63
Imp      -0.63
5        -0.63
alan     -0.62
Bren     -0.62
Gir      -0.62
Bot      -0.60
آم       -0.60
Gour     -0.60
POSITIVE LOGITS
Token    Logit
Raim     1.38
Ra       1.25
RAID     1.23
Ra       1.20
Raider   1.19
RA       1.18
raisins  1.18
Raub     1.13
raider   1.13
Raoul    1.11
Activations Density 0.025%