INDEX
Explanations
words related to negative or critical opinions or actions
references to denial
Negative Logits
ramid         -0.66
Annotations   -0.65
Trooper       -0.63
lda           -0.61
Carbuncle     -0.60
committee     -0.60
LER           -0.59
EED           -0.59
intendent     -0.58
permitting    -0.58
Positive Logits
unci          1.53
unciation     1.49
izens         1.45
ounces        1.20
igration      1.19
uclear        1.18
izen          1.15
ormal         1.15
atural        1.14
igrated       1.10
Activations Density 0.016%