INDEX
Explanations
Occurrences of the word "denounce" or its variants, indicating disapproval or condemnation.
Negative Logits
hea         -0.16
ÅŁk         -0.16
inner       -0.14
å·          -0.14
rossover    -0.14
alam        -0.14
quets       -0.14
agner       -0.14
bben        -0.14
lice        -0.14
Positive Logits
unc         0.30
ouncing     0.28
ounce       0.28
unciation   0.26
unci        0.26
ouncements  0.24
ounces      0.23
iers        0.22
ational     0.22
ounc        0.22
Activations Density 0.005%