INDEX
Explanations
verbs that suggest taking action or making progress
phrases that indicate actions initiated by a subject
Negative Logits
adal          -0.67
Silver        -0.65
ario          -0.63
ime           -0.62
>)            -0.62
atal          -0.62
atem          -0.60
jon           -0.60
zman          -0.59
atl           -0.59
Positive Logits
virtue        1.27
placing       1.17
allowing      1.17
adding        1.13
eliminating   1.13
creating      1.13
removing      1.11
providing     1.10
sacrificing   1.08
products      1.08
Activations Density 0.102%