NEGATIVE LOGITS
/TR            -0.07
arrest         -0.07
Cul            -0.06
artist         -0.06
emission       -0.06
hall           -0.06
ה              -0.06
Harlem         -0.06
accelerated    -0.06
therapists     -0.06
POSITIVE LOGITS
sponge         0.18
Sponge         0.13
ponge          0.11
.sponge        0.08
Spy            0.07
ğiz            0.07
Сп             0.07
πο             0.07
ніп            0.07
구성             0.07
Activations Density 0.001%