INDEX
Explanations
phrases related to political and societal structures or events
Negative Logits
enance            -0.56
ocracy            -0.53
killers           -0.51
ById              -0.50
phabet            -0.49
unfocusedRange    -0.48
rise              -0.48
ylum              -0.48
cleaners          -0.48
pronouns          -0.48
Positive Logits
fil               0.62
advert            0.61
haps              0.53
Tours             0.51
JECT              0.50
azed              0.50
jured             0.50
ased              0.50
jected            0.49
active            0.48
Activations Density: 8.929%