Explanations
programming-related annotations and comments in code
Negative Logits
StructEnd       -0.72
transQ          -0.70
PreferredItem   -0.68
ArrowToggle     -0.68
AndEndTag       -0.68
يميديا          -0.67
twimg           -0.65
رشف             -0.63
ویکیپدی         -0.62
adpleegd        -0.60
Positive Logits
*               0.83
*@              0.77
*@              0.76
*               0.69
**              0.56
**              0.52
*"              0.51
MIT             0.49
lishes          0.46
*\              0.45
Activations Density 0.072%
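
The density figure above is presumably the fraction of sampled tokens on which this feature has a nonzero activation. That reading is an assumption, since the page does not define the term. Under that assumption, a minimal sketch of the calculation could look like the following. The function name `activation_density`, the `threshold` parameter, and the sample data are all hypothetical and are not taken from the dashboard.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of activations strictly above `threshold`.

    Assumes "density" means the share of tokens on which the feature
    fires; this definition is an assumption, not taken from the source.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: the feature fires on 1 token out of 10,000.
sample = [0.0] * 9999 + [1.3]
density = activation_density(sample)
print(f"{density:.3%}")
```

A value such as 0.072% would then correspond to the feature firing on roughly 7 tokens out of every 10,000 sampled.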