INDEX
Negative Logits
hood            -0.74
LOAD            -0.69
senal           -0.68
士              -0.67
Nightmares      -0.66
iard            -0.66
Advertisement   -0.65
igham           -0.64
ãĥ¯             -0.64
sburgh          -0.64

Positive Logits
roph             1.18
rament           1.13
igmat            1.11
rology           1.08
rolog            1.04
ron              1.04
ro               1.00
rop              0.98
ral              0.98
rob              0.96
Activations Density 0.024%
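
As a rough illustration of how a density figure like the one above could arise, the sketch below assumes that "activation density" means the fraction of token positions where the feature's activation is greater than zero. That definition, the function name `activation_density`, and the sample data are all assumptions made for illustration; none of them come from the dashboard itself.

```python
def activation_density(activations):
    """Return the fraction of positions where the feature is active (> 0).

    Assumed definition: the dashboard does not state how density is computed.
    """
    if not activations:
        raise ValueError("need at least one activation value")
    active = sum(1 for a in activations if a > 0)
    return active / len(activations)


# Hypothetical data: the feature fires on 3 of 12,500 token positions.
acts = [0.0] * 12_500
for pos, value in [(10, 0.5), (4_000, 1.2), (9_999, 0.3)]:
    acts[pos] = value

density = activation_density(acts)
print(f"{density:.3%}")  # prints 0.024%
```

Under this reading, a density of 0.024% corresponds to the feature firing on roughly 24 of every 100,000 tokens.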