INDEX
NEGATIVE LOGITS
hips          -0.83
raints        -0.70
eters         -0.70
rongh         -0.68
ority         -0.66
seless        -0.62
shall         -0.60
recent        -0.59
strongly      -0.59
notations     -0.59
POSITIVE LOGITS
Jet            1.23
going          1.21
prey           0.87
jet            0.80
access         0.75
accessibility  0.74
azon           0.73
sailing        0.72
wallet         0.68
enough         0.68
Activations Density 0.049%