Explanations
phrases indicating actions or commands
Negative Logits
Token             Logit
nces              -0.69
wives             -0.65
andowski          -0.64
ELD               -0.63
soType            -0.62
soDeliveryDate    -0.62
ensive            -0.61
ccording          -0.60
Klu               -0.60
outwe             -0.59
Positive Logits
Token             Logit
ggles             1.30
activate          1.24
maximize          1.09
obtain            1.04
locate            1.04
toggle            1.03
ensure            1.03
customize         1.02
create            1.01
retrieve          1.01
Activations Density: 0.049%