INDEX
Explanations
instances of the word "policy" occurring in the text
references to policy-related issues and discussions
Negative Logits
Stain       -0.76
Ñĭ          -0.76
ingly       -0.70
batch       -0.67
Cruise      -0.67
DAY         -0.66
iculture    -0.66
apy         -0.65
orf         -0.65
ibles       -0.64
Positive Logits
making          1.24
makers          1.19
maker           1.10
makers          1.10
prescriptions   1.06
holders         1.02
stances         0.97
holder          0.95
choices         0.92
advisor         0.91
Activations Density: 0.051%