INDEX
Explanations
commands or instructions to perform certain actions
phrases related to instructions or actions to take
Negative Logits
Token        Logit
ylum         -0.86
arious       -0.70
blooded      -0.66
stood        -0.65
lasts        -0.64
Jud          -0.64
leaders      -0.63
hearted      -0.63
goodwill     -0.63
democracy    -0.63
Positive Logits
Token        Logit
configure    1.37
manually     1.36
use          1.35
specify      1.35
modify       1.29
utilize      1.26
customize    1.25
install      1.25
choose       1.22
omit         1.20
Activations Density: 0.340%