INDEX
Explanations
syntax elements, particularly parentheses and braces, in code or programming-related text
Negative Logits
ogonal        -0.85
bezpiecz      -0.83
aarrggbb      -0.83
Langdon       -0.82
Coffin        -0.82
beginnetje    -0.82
OfBirth       -0.82
помним        -0.81
fucked        -0.79
fuck          -0.79
Positive Logits
']))          1.30
()))          1.15
()            1.06
)))           1.00
)             0.99
--)           0.99
]             0.98
']?>          0.97
++)           0.96
))            0.95
Activations Density: 0.074%