INDEX
Explanations
file paths or references to directories in code
relative path navigation
NEGATIVE LOGITS
criminator  -0.48
achat       -0.46
qli         -0.46
straint     -0.45
adget       -0.45
dieß        -0.44
riwal       -0.44
polig       -0.43
ſche        -0.41
pegat       -0.41
POSITIVE LOGITS
'../        2.11
"../        1.89
'../../     1.66
"../../     1.52
('../       1.52
'./../      1.50
'../../../  1.46
("../       1.45
"../        1.34
"../../../  1.29
Activations Density 0.001%
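
The tokens listed under POSITIVE LOGITS are all openings of relative paths that climb one or more directory levels. As a rough illustration of what "relative path navigation" means for those strings, here is a minimal Python sketch. It assumes POSIX-style separators, and the base directory, the file name, and the helper names `resolve` and `levels_up` are hypothetical, chosen only for this example. The prefixes are the listed tokens with their leading quote and parenthesis characters stripped.

```python
import posixpath

# Path prefixes taken from the POSITIVE LOGITS tokens, with the leading
# quote/parenthesis characters removed (an assumption made for illustration).
PREFIXES = ["../", "../../", "./../", "../../../"]


def resolve(base_dir: str, relative: str) -> str:
    """Resolve a relative path against base_dir using POSIX rules."""
    return posixpath.normpath(posixpath.join(base_dir, relative))


def levels_up(relative: str) -> int:
    """Count how many directory levels a relative path climbs."""
    return sum(1 for part in relative.split("/") if part == "..")


if __name__ == "__main__":
    base = "/project/src/app"  # hypothetical starting directory
    for prefix in PREFIXES:
        target = prefix + "config.json"  # hypothetical file name
        print(prefix, levels_up(prefix), resolve(base, target))
```

Each `..` segment moves one directory up from the starting point, so `'../../'` and `'./../'` differ in how far they climb even though both begin a relative path.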