Explanations
words related to governmental control or oppression
concepts related to forms of repression and oppressive systems
Negative Logits
PATH      -0.76
TAIN      -0.73
tein      -0.72
Fair      -0.71
gran      -0.69
leaf      -0.68
litter    -0.68
lder      -0.68
pool      -0.67
laus      -0.67
Positive Logits
repression      1.37
oppression      0.95
repressive      0.93
suppression     0.88
regimes         0.86
dictatorship    0.82
crackdown       0.80
apparatus       0.80
brutality       0.80
dictators       0.76
Activations Density 0.018%
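
For readers unfamiliar with the density figure, the sketch below shows one common way such a number can be computed: the fraction of token positions on which the feature's activation is above zero. This is an illustration under stated assumptions, not the dashboard's actual code. The function name activation_density, the threshold parameter, and the sample data (18 active tokens out of 100,000) are all hypothetical and were chosen only so the result matches the 0.018% shown above.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds threshold.

    Assumption: "density" means the share of tokens on which the feature fires.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample, not real data: 18 firing tokens among 100,000.
sample = [1.0] * 18 + [0.0] * 99_982
density = activation_density(sample)
print(f"{density:.3%}")
```

A density this low suggests the feature is highly selective: it fires on roughly 2 tokens in every 10,000, which fits a feature tied to a narrow vocabulary such as "repression" and "dictatorship".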