Explanations
specific formatting and syntax elements related to programming or code structure
Negative Logits
Houſe      -1.02
Efq        -0.97
ſelf       -0.93
―――――      -0.92
ſeveral    -0.91
Reſ        -0.91
Diſ        -0.90
myſelf     -0.89
Theſe      -0.88
Datuak     -0.88
Positive Logits
,          0.64
.          0.62
far        0.61
           0.58
(          0.56
Far        0.53
↵          0.52
ly         0.50
py         0.49
mente      0.49
Activations Density: 0.026%