Explanations
phrases indicating support or approval
phrases related to support or endorsement of various policies or causes
Negative Logits
DragonMagazine    -0.80
¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯  -0.74
ngth              -0.69
Buzz              -0.65
opard             -0.64
semble            -0.63
VI                -0.63
ertation          -0.63
stumble           -0.62
hemer             -0.61
Positive Logits
abol          1.22
decriminal    1.15
legalizing    1.12
banning       1.07
repeal        1.03
stricter      1.03
abolition     1.01
inclusion     1.01
legalization  1.00
repealing     0.99
Activations Density 0.299%