INDEX
Explanations
words related to being against something or having a negative connotation
terms related to prohibition or restrictive measures
Negative Logits
ail       -0.67
anchor    -0.65
airst     -0.64
Shades    -0.63
bottled   -0.63
coping    -0.62
enjoyed   -0.61
shr       -0.61
Ic        -0.60
slashed   -0.60
Positive Logits
pro       3.87
Pro       2.17
PRO       1.62
Pro       1.61
prop      1.46
PRO       1.43
prot      1.38
pro       1.29
prof      1.19
pr        1.17
Activations Density: 0.012%
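
The page does not say how the density figure is computed. A common reading is the fraction of sampled token positions on which the feature has a nonzero activation. The sketch below assumes that definition; the function name `activation_density`, the `threshold` parameter, and the sample data are hypothetical and chosen only for illustration.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of positions whose activation exceeds `threshold`.

    Assumption: "density" means (number of active positions) / (total positions).
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: 3 active positions out of 25,000.
sample = [0.0] * 24_997 + [1.3, 0.4, 2.1]
density = activation_density(sample)
print(f"{density:.3%}")
```

Under this reading, a density of 0.012% corresponds to the feature firing on roughly 1 in every 8,000 token positions.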