Explanations
terms related to power dynamics and hierarchies, including words related to government structures and brand power
Negative Logits
Von      -0.75
eday     -0.69
eryl     -0.69
algia    -0.67
auder    -0.65
eret     -0.65
ALK      -0.64
akeru    -0.64
Bei      -0.63
roit     -0.63
Positive Logits
houses    1.16
stroke    1.01
lifting   1.00
lessness  0.97
outage    0.91
wielded   0.90
puff      0.89
plant     0.88
FUL       0.86
vested    0.85
Activations Density: 0.584%