Explanations
HTML tags and structure elements in code
Negative Logits
dafx              -0.69
]));              -0.68
+:+               -0.67
PYX               -0.65
}")               -0.65
")]               -0.65
')],              -0.63
ReferenceEquals   -0.62
'):               -0.60
certe             -0.59
Positive Logits
↵↵↵            0.82
↵              0.70
↵↵↵↵           0.65
↵↵             0.64
↵↵↵↵↵          0.63
↵↵↵↵↵↵         0.57
↵↵↵↵↵↵↵↵↵↵↵    0.53
↵↵↵↵↵↵↵        0.52
↵↵↵↵↵↵↵↵↵↵     0.50
↵↵↵↵↵↵↵↵       0.50
Activations Density 0.083%