INDEX
Negative Logits
Norm      0.41
sev       0.39
NOK       0.38
Hock      0.37
rabbit    0.36
Sev       0.35
Norm      0.35
äk        0.35
Exotic    0.34
Fe        0.34
Positive Logits
hers      0.39
slack     0.38
㽡        0.38
вища      0.38
pans      0.37
जू        0.37
躋        0.37
owning    0.36
útbol     0.36
Hers      0.36
Activations Density 0.000%