NEGATIVE LOGITS
reserves        -0.08
turbine         -0.07
storms          -0.07
welfare         -0.07
Bus             -0.07
edge            -0.06
words           -0.06
boats           -0.06
robin           -0.06
reservations    -0.06
POSITIVE LOGITS
dating          0.08
Dating          0.07
/autoload       0.06
đàn             0.06
Î               0.06
ان              0.06
(COLOR          0.06
⌒               0.06
(sz             0.06
dating          0.06
Activations Density 0.005%