Explanations
gerunds or present participles indicating ongoing actions or processes
Negative Logits
/remove     -0.25
ings        -0.23
ÂŃing       -0.18
/write      -0.18
veloper     -0.17
INGS        -0.17
/disable    -0.17
/close      -0.17
/run        -0.17
/delete     -0.16
Positive Logits
ly           0.31
/testing     0.21
/loading     0.21
/logging     0.18
LY           0.18
oneself      0.18
/up          0.17
ny           0.17
ness         0.16
/rem         0.16
Activations Density 1.292%