INDEX
Explanations
references to political actions and policies related to healthcare and leadership decisions
Negative Logits
lsen      -0.16
ourt      -0.15
ensor     -0.15
onet      -0.15
diet      -0.15
igham     -0.14
alt       -0.14
nech      -0.14
amen      -0.14
Diet      -0.14
Positive Logits
inois      0.15
-*-č↵      0.15
avax       0.14
(PC        0.13
umni       0.13
_EXTENDED  0.13
ucch       0.13
ovel       0.13
dood       0.13
acement    0.13
Activations Density 0.134%