Explanations
code-related elements, specifically import statements and directives in programming languages
Negative Logits
LookAnd          -1.03
ModelExpression  -0.98
IsMutable        -0.89
DockStyle        -0.88
principalColumn  -0.88
oprot            -0.84
mybatisplus      -0.84
хьтан            -0.84
Хьажоргаш        -0.83
Efq              -0.83
Positive Logits
import         0.91
<b>            0.66
↵↵             0.65
<strong>       0.65
[toxicity=0]   0.61
<h1>           0.60
</tr>          0.59
↵↵↵↵           0.57
<h4>           0.56
//             0.56
Activations Density 0.076%