INDEX
Explanations
phrases related to political power dynamics and influence
Head Attr Weights
0:0.30
1:0.02
2:0.27
3:0.07
4:0.03
5:0.04
6:0.02
7:0.04
8:0.03
9:0.03
10:0.07
11:0.02
Negative Logits
eeds    -2.73
Plot    -2.66
Stat    -2.63
plots   -2.59
Cipher  -2.52
yton    -2.52
atana   -2.47
Points  -2.41
bold    -2.39
eed     -2.37
Positive Logits
vacuum     5.21
Vac        4.05
vacancy    3.84
vac        3.77
vacancies  3.32
cleaners   3.16
emptied    3.13
vacated    3.12
expel      3.03
uum        2.92
Activations Density 0.000%