Explanations
references to lists or definitions in a structured format
Negative Logits
Garrick     -0.81
Shiva       -0.76
Shiva       -0.75
Salvador    -0.75
Kass        -0.74
Oakley      -0.73
casio       -0.73
meau        -0.73
Nixon       -0.71
Ernesto     -0.71
Positive Logits
DL          1.27
dl          1.22
DL          1.20
dl          1.05
Dl          0.77
KX          0.71
theless     0.69
Portale     0.69
ύν          0.68
Zinn        0.66
Activations Density: 0.005%