INDEX
Explanations
words related to negative or harmful situations or conditions
references to unwanted or unwelcome experiences and situations
NEGATIVE LOGITS
Token   Logit
lass    -0.87
hetti   -0.83
mberg   -0.82
mun     -0.81
alach   -0.81
ysics   -0.80
opsy    -0.80
sis     -0.79
urgy    -0.78
oan     -0.78
POSITIVE LOGITS
Token        Logit
unwanted     1.32
pregnancies  1.17
pregnancy    0.94
ãĤī          0.87
unwelcome    0.83
Parenthood   0.82
Jagu         0.80
adolesc      0.78
indisc       0.75
offspring    0.72
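One way lists like the two above could be produced, sketched under assumptions: the tool holds a per-token logit score for this feature and reports the highest and lowest entries. The `logit_scores` mapping and the `top_and_bottom` helper are hypothetical names introduced for illustration; the sample values are copied from the lists above, and nothing here is confirmed to be the dashboard's actual implementation.

```python
from typing import Dict, List, Tuple

Scored = Tuple[str, float]

def top_and_bottom(logit_scores: Dict[str, float], k: int = 10) -> Tuple[List[Scored], List[Scored]]:
    """Return (k most positive, k most negative) tokens by score.

    Hypothetical helper: assumes one scalar logit score per token.
    """
    # Rank every token from highest to lowest score.
    ranked = sorted(logit_scores.items(), key=lambda kv: kv[1], reverse=True)
    positive = ranked[:k]
    # Take the tail and reorder it so the most negative token comes first.
    negative = sorted(ranked[-k:], key=lambda kv: kv[1])
    return positive, negative

# A few values copied from the lists above (not the full vocabulary).
logit_scores = {
    "unwanted": 1.32, "pregnancies": 1.17, "pregnancy": 0.94,
    "lass": -0.87, "hetti": -0.83, "mberg": -0.82,
}
positive, negative = top_and_bottom(logit_scores, k=2)
print(positive)
print(negative)
```

With `k=2` the first list starts at `unwanted` and the second at `lass`, matching the ordering used in the lists above.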
Activations Density 0.007%
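If the density is read as the share of tokens on which the feature fires, 0.007% corresponds to about 7 active tokens per 100,000. The sketch below shows that arithmetic on synthetic data; the "activation above zero" threshold and the `activation_density` helper are assumptions for illustration, not the dashboard's documented definition.

```python
from typing import Sequence

def activation_density(activations: Sequence[float], threshold: float = 0.0) -> float:
    """Percentage of tokens whose activation exceeds `threshold`.

    Hypothetical helper: the threshold of 0.0 is an assumed definition of "active".
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return 100.0 * active / len(activations)

# Synthetic data: 7 active tokens out of 100,000.
sample = [0.0] * 99_993 + [1.5] * 7
print(f"{activation_density(sample):.3f}%")  # prints 0.007%
```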