Explanations
the presence of progressive verbs and actions
Negative Logits
/remove     -0.24
/delete     -0.20
avr         -0.18
/write      -0.18
/disable    -0.17
heimer      -0.17
ÂŃing       -0.17
/read       -0.16
/change     -0.16
ings        -0.16
Positive Logits
/loading    0.23
ly          0.23
/testing    0.19
/logging    0.17
g           0.16
LY          0.16
toward      0.15
Verg        0.15
redient     0.15
redients    0.15
Activations Density: 0.452%
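
An activation density figure like this is commonly read as the share of sampled token positions on which the feature fires. The sketch below illustrates that reading only; the function name `activation_density`, the zero threshold, and the sample data are all hypothetical, and the dashboard's exact definition is not stated here.

```python
def activation_density(activations, threshold=0.0):
    """Return the percentage of activations strictly above `threshold`.

    Assumption: "density" means the fraction of token positions where the
    feature is active. The tool that produced the figure may define it
    differently.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return 100.0 * active / len(activations)


# Hypothetical data: 1000 token positions, 5 of which activate the feature.
sample = [0.0] * 995 + [0.7, 1.2, 0.3, 2.5, 0.1]
density = activation_density(sample)
print(f"{density:.3f}%")
```

Under this reading, a density of 0.452% would correspond to the feature being active on roughly 45 of every 10,000 sampled positions.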