INDEX
Explanations
words related to social issues and activism such as legitimizing, tolerating, destabilizing, and revitalizing
Negative Logits
Token      Logit
HOU        -0.71
cloth      -0.64
Meadows    -0.62
uden       -0.60
erity      -0.60
GAME       -0.60
words      -0.59
tower      -0.56
Ank        -0.56
Aad        -0.56
Positive Logits
Token      Logit
ized       2.52
ization    2.52
izing      2.48
izations   2.28
izes       2.24
izers      2.20
isation    2.17
ize        2.12
ised       2.11
izer       2.06
Activations Density: 1.791%