INDEX
Explanations
terms related to "patch" or modifications in various contexts
Negative Logits
Token    Logit
JI       -0.16
riers    -0.15
igel     -0.15
pga      -0.15
zing     -0.14
axis     -0.14
bis      -0.14
ssue     -0.14
studs    -0.14
rawer    -0.14
Positive Logits
Token    Logit
work     0.37
(patch   0.29
Patch    0.28
patch    0.26
y        0.26
worked   0.25
(es      0.25
Patch    0.24
ogue     0.24
works    0.23
Activations Density: 0.013%