Explanations
code-related function calls and their parameters
Negative Logits
"))    -1.36
}")    -1.32
")     -1.29
"){    -1.24
'))    -1.23
")     -1.22
!")    -1.19
"},    -1.18
")]    -1.16
'){    -1.15
Positive Logits
);     1.79
());   1.47
');    1.45
.);    1.45
");    1.44
);     1.42
`);    1.33
%);    1.33
_);    1.31
);     1.29
Activations Density 0.738%
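
The density figure above can be read as the share of token positions on which the feature fires. The sketch below shows one way such a number might be computed. It assumes "density" means the fraction of positions with an activation above zero; the function name, the threshold parameter, and the sample data are all hypothetical and do not come from the dashboard itself.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of positions whose activation exceeds `threshold`.

    Assumption: a position counts as "active" when its activation is
    strictly greater than the threshold (0.0 by default).
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical data: 1000 token positions, 7 of which activate the feature.
sample = [0.0] * 993 + [0.4, 1.2, 0.1, 2.5, 0.9, 0.3, 1.7]

density = activation_density(sample)
print(f"{density:.3%}")
```

Under this reading, a density of 0.738% would correspond to the feature firing on roughly 7 or 8 of every 1000 token positions.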