INDEX
Explanations
URL paths with tokens enclosed in curly brackets
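To make the explanation concrete, here is a minimal sketch of the kind of text it describes: URL path templates whose variable segments are written in curly brackets. The regular expression, the function name `find_path_placeholders`, and the sample paths are illustrative assumptions, not taken from the feature's actual activating examples.

```python
import re

# Hypothetical pattern: a "/" followed by a curly-bracket placeholder such as {id}.
PLACEHOLDER = re.compile(r"/\{([^{}/]+)\}")

def find_path_placeholders(path: str) -> list[str]:
    """Return the names of curly-bracket placeholders found in a URL path."""
    return PLACEHOLDER.findall(path)

# Illustrative inputs only.
print(find_path_placeholders("/users/{userId}/posts/{postId}"))  # ['userId', 'postId']
print(find_path_placeholders("/static/index.html"))              # []
```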
NEGATIVE LOGITS
Token    Logit
/$       -1.54
/";      -1.52
/"+      -1.49
/        -1.48
/"       -1.48
/${      -1.45
/'       -1.41
/")      -1.41
/%       -1.41
/");     -1.41
POSITIVE LOGITS
Token    Logit
↵↵       0.98
(blank)  0.93
↵        0.92
"        0.84
(blank)  0.80
The      0.79
.        0.74
I        0.68
a        0.65
'        0.63
Activations Density 0.891%
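Read as the share of tokens on which the feature fires, a density of 0.891% corresponds to roughly 9 in every 1,000 tokens. The sketch below assumes that reading (the fraction of tokens with a positive activation); the function name `activation_density` and the toy activation values are hypothetical, and the dashboard's exact definition may differ.

```python
def activation_density(activations: list[float]) -> float:
    """Fraction of tokens with a positive activation (assumed definition)."""
    if not activations:
        return 0.0
    fired = sum(1 for a in activations if a > 0)
    return fired / len(activations)

# Toy data, not real feature activations: 1 of 8 tokens fires.
toy = [0.0, 0.0, 2.3, 0.0, 0.0, 0.0, 0.0, 0.0]
print(f"{activation_density(toy):.3%}")  # 12.500%
```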