INDEX
Explanations
words related to complaints or criticism
words related to complaints or issues
Negative Logits
plane       -0.64
²¾          -0.62
watered     -0.61
thinly      -0.59
racially    -0.58
yip         -0.58
pivot       -0.58
sal         -0.57
hormones    -0.57
pps         -0.57
Positive Logits
acent       1.53
icating     1.53
imentary    1.35
icates      1.35
icated      1.34
aints       1.30
icit        1.27
iance       1.25
ainer       1.21
aint        1.21
Activations Density: 0.032%