INDEX
Explanations
negations related to same-sex marriage and legal rights
Negative Logits
brim        -0.73
Cotton      -0.64
slack       -0.64
nerv        -0.62
Loren       -0.62
Polo        -0.61
Brach       -0.60
cigars      -0.59
IPM         -0.59
sacrific    -0.58
Positive Logits
origin       1.15
colored      1.10
sided        1.06
sized        1.05
dimensional  0.98
entity       0.97
named        0.94
sex          0.94
gender       0.93
minded       0.92
Activations Density: 0.008%