INDEX
Explanations
terms related to awe and wonder
terms related to gender, specifically emphasizing the concept of "woman."
Negative Logits
ebook       -0.63
packing     -0.61
screen      -0.60
ENTER       -0.60
Hampton     -0.59
OUT         -0.59
sheet       -0.57
https       -0.57
pasture     -0.56
Beautiful   -0.56
Positive Logits
omen        1.26
ovic        0.86
stru        0.85
oshenko     0.84
opol        0.83
nant        0.83
oso         0.83
eday        0.82
iak         0.81
nown        0.81
Activations Density 0.005%
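
For readers unfamiliar with the density figure, here is a minimal sketch of how such a number could be computed. It assumes that density means the fraction of token positions on which the feature's activation exceeds a threshold. The function name `activation_density`, the threshold default, and the sample data are all hypothetical and do not come from this page.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds threshold.

    Assumption: "density" is the share of positions where the feature fires.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical data: the feature fires on 1 of 20,000 token positions.
acts = [0.0] * 19_999 + [3.2]
density = activation_density(acts)
print(f"{density:.3%}")  # prints 0.005%
```

Under this reading, a density of 0.005% would mean the feature is active on roughly one token in every 20,000.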