Explanations
references to government and leadership
Negative Logits
Token          Logit
ç              -0.15
ive            -0.15
ard            -0.15
ctl            -0.15
ore            -0.15
imagination    -0.14
went           -0.14
hz             -0.14
ier            -0.14
ally           -0.14
Positive Logits
Token          Logit
exact          0.24
move           0.24
Exact          0.18
moves          0.18
precise        0.18
exact          0.17
move           0.17
Epoch          0.17
announcement   0.17
timing         0.16
Activations Density: 0.113%