INDEX
Explanations
instances of parentheses in the text
Negative Logits
}]);
-0.65
")))
-0.62
'));
-0.61
']))
-0.58
Stu
-0.57
]));
-0.57
сто
-0.57
Selatan
-0.57
')))
-0.56
to
-0.55
Positive Logits
">(</
1.47
(\
1.42
(
1.41
__(
1.39
>(</
1.38
}^{(
1.37
("(
1.33
($(
1.30
}(
1.27
($(
1.27
Activations Density 1.288%