Explanations
parentheses and other opening symbols
Negative Logits
Token      Logit
iſt        -0.82
Beſ        -0.78
")))       -0.75
'));       -0.74
}]);       -0.74
―――――      -0.72
%");       -0.68
$_"        -0.68
Theſe      -0.67
Diſ        -0.67
Positive Logits
Token      Logit
(          1.69
(\         1.52
">(</      1.51
>(</       1.49
(          1.48
(          1.48
}^{(       1.42
-(         1.40
{(         1.36
__(        1.35
Activations Density 1.387%
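
The density figure above is most naturally read as the share of token positions on which the feature fires at all. The source does not say how it was computed, so the following is only a minimal sketch under that assumption: the function name `activation_density`, the zero threshold, and the toy data are all hypothetical and are not taken from the dashboard.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds `threshold`.

    Assumption: "density" means the share of positions where the feature is
    active; the dashboard's actual definition is not given in the source.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical toy data: 1000 token positions, 14 of them active.
toy = [0.0] * 986 + [0.5] * 14
density = activation_density(toy)
print(f"{density:.3%}")  # prints 1.400%
```

Under this reading, a density of 1.387% would mean the feature is active on roughly 14 out of every 1000 tokens sampled.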