Explanations
occurrences of the word "patch" and its variants
Negative Logits
Token              Logit
D                  -0.65
I                  -0.64
(                  -0.62
<eos>              -0.62
-                  -0.61
P                  -0.59
J                  -0.59
C                  -0.58
(no token shown)   -0.58
L                  -0.58
Positive Logits
Token      Logit
patches    1.26
patch      1.25
―――――      1.25
Theſe      1.23
Monfieur   1.23
Efq        1.20
بيها       1.20
Anſ        1.19
Patches    1.17
Reſ        1.16
Activations Density: 0.166%
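
As a rough illustration of how a density figure like this can be read, the sketch below assumes that activation density means the fraction of token positions on which the feature's activation is above zero. That definition, the function name activation_density, and the sample data are all assumptions made for illustration; they are not taken from this page.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds `threshold`.

    Assumption: "density" is the share of positions where the feature fires.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: 100,000 token positions, 166 of them active.
sample = [1.0] * 166 + [0.0] * (100_000 - 166)
density = activation_density(sample)
print(f"{density:.3%}")  # prints 0.166%
```

Under that assumed definition, a density of 0.166% corresponds to the feature firing on roughly 1 in 600 token positions.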