INDEX
Explanations
sections related to political discourse and government actions
Negative Logits
aic             -0.71
Window          -0.70
omorphic        -0.68
consequential   -0.65
igo             -0.62
artisan         -0.62
Gothic          -0.61
emetery         -0.61
ORN             -0.59
RAW             -0.58
Positive Logits
took            1.19
knew            1.17
chose           1.16
withdrew        1.15
gave            1.13
went            1.13
flew            1.12
wore            1.10
came            1.08
enjoys          1.08
Activations Density 0.283%