INDEX
Explanations
words related to societal issues, oppression, and political commentary
Negative Logits
Token             Logit
cloth             -0.77
BOOK              -0.72
book              -0.66
friend            -0.64
manship           -0.62
lihood            -0.62
words             -0.62
GAME              -0.60
soDeliveryDate    -0.60
rooms             -0.59
Positive Logits
Token             Logit
ized              2.22
ization           2.18
izing             2.13
istic             1.99
ize               1.88
ism               1.86
izes              1.85
ists              1.84
isation           1.84
istically         1.82
Activations Density 3.624%
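
The density figure above is presumably the share of sampled token positions on which this feature fires. The page does not say how it was computed, so the sketch below is only an illustration under that assumption: the function name `activation_density`, the `threshold` parameter, and the sample values are all hypothetical and not taken from the dashboard.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds `threshold`.

    Assumption: "density" means the share of positions where the feature is
    active (strictly above the threshold). This is not confirmed by the source.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Made-up activations for 8 token positions; only one position is active.
sample = [0.0, 0.0, 1.3, 0.0, 0.0, 0.0, 0.0, 0.0]
density = activation_density(sample)
print(f"{density:.3%}")  # prints 12.500%
```

Under this reading, a density of 3.624% would mean the feature is active on roughly 36 of every 1,000 sampled tokens.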