INDEX
Explanations
YouTube video links and associated metadata
Negative Logits
!")    -0.81
."]    -0.79
—    -0.79
leſs    -0.78
—    -0.76
`{.    -0.76
"])    -0.74
".    -0.74
ویکیپدیا    -0.73
$")    -0.73
Positive Logits
Z    1.05
Y    1.04
q    1.00
j    1.00
Q    0.99
J    0.96
z    0.94
K    0.91
k    0.90
U    0.90
Activations Density: 0.412%