Explanations
patterns of repetition in data
Negative Logits
itſelf        -1.74
myſelf        -1.72
―――――         -1.60
Monfieur      -1.56
doubtnut      -1.55
themſelves    -1.53
Theſe         -1.49
Jefus         -1.46
becauſe       -1.45
Anſ           -1.44
Positive Logits
</strong>     0.82
(             0.80
.             0.79
I             0.79
</b>          0.79
c             0.76
M             0.76
and           0.75
G             0.75
C             0.74
Activations Density: 0.350%