INDEX
Explanations
slashes or path-like structures in code or data
url paths and separators
Negative Logits
dezelve   -0.56
Anſ       -0.56
Houſe     -0.55
feroit    -0.52
ſelf      -0.52
Perſ      -0.51
ſind      -0.50
Inſ       -0.50
eſſ       -0.50
myſelf    -0.50
Positive Logits
/         1.05
{}/       0.89
/         0.88
("/       0.81
()/       0.81
:/        0.80
('/       0.80
"/        0.79
|/        0.79
'/        0.79
Activations Density 0.008%