INDEX
Explanations
nested structures and patterns, particularly in programming or mathematical expressions
Negative Logits
""))
-0.70
')
-0.68
...
-0.67
"")
-0.64
isen
-0.64
()")
-0.62
$",
-0.62
)")
-0.61
</s>
-0.60
:])
-0.59
Positive Logits
{
1.88
{
1.73
="{
1.66
>{
1.58
[{
1.53
({
1.51
={
1.50
("{
1.50
"{
1.49
/{
1.46
Activations Density 0.748%