INDEX
NEGATIVE LOGITS
Token        Logit
dden         -0.78
Argon        -0.73
Continent    -0.71
rose         -0.70
Wit          -0.67
istics       -0.64
CTV          -0.63
Passion      -0.62
zzy          -0.62
Cats         -0.61
POSITIVE LOGITS
Token          Logit
authorizing    1.20
prohibiting    1.19
banning        1.13
restricting    1.09
issued         1.03
barring        1.00
forb           0.98
granting       0.98
limiting       0.97
legalizing     0.96
ACTIVATIONS DENSITY
0.111%