INDEX
Explanations
occurrences of verbs related to achieving or meeting criteria
NEGATIVE LOGITS
Named       -0.23
Designed    -0.22
_named      -0.22
Named       -0.21
wrote       -0.20
Designed    -0.20
Asked       -0.19
designed    -0.19
kill        -0.18
downgrade   -0.18
POSITIVE LOGITS
deleted      0.29
withdrawn    0.26
altered      0.26
severed      0.26
destroyed    0.25
removed      0.25
changed      0.24
revoked      0.24
deleted      0.24
roken        0.24
Activations Density 0.231%