Explanations
the word "normal" or variations of it
the word "normal" and its various contexts related to societal standards and behaviors
Negative Logits
Token       Logit
better      -0.74
hani        -0.70
artisan     -0.68
leted       -0.66
raped       -0.64
Sov         -0.64
intel       -0.63
resent      -0.62
hung        -0.61
Winged      -0.61
Positive Logits
Token       Logit
cy          1.50
ization     1.44
izing       1.40
isation     1.35
ised        1.34
izes        1.30
ize         1.23
ising       1.21
ized        1.19
izations    1.12
Activations Density: 0.044%