INDEX
Explanations
lines of code or comments in a programming context
Negative Logits
—,    -0.78
Bona    -0.76
ation    -0.72
nas    -0.70
istani    -0.70
ona    -0.68
Bon    -0.67
cillo    -0.66
—,    -0.66
al    -0.66
Positive Logits
///    1.66
///    1.33
///    1.02
/////    0.90
///<    0.84
ায়    0.84
phazard    0.84
ţiile    0.81
yto    0.80
്    0.79
Activations Density 0.042%