INDEX
Explanations
references to various programming and technical concepts
Negative Logits
InitVars      -0.84
bootstrapcdn  -0.80
DockStyle     -0.77
BorderRadius  -0.75
InputBorder   -0.73
ENEFITS       -0.72
getY          -0.71
elebr         -0.70
myModal       -0.70
loyment       -0.69
Positive Logits
↵             1.21
↵↵            0.90
<eos>         0.78
↵↵↵           0.77
}));          0.76
})));         0.75
</tr>         0.73
[toxicity=0]  0.73
<tbody>       0.72
↵↵↵↵          0.71
Activations Density: 0.200%