Explanations
various symbols and punctuation marks in the text, particularly periods and colons
Negative Logits
myſelf      -0.54
houſe       -0.46
रीदारी      -0.46
themselves  -0.44
yourself    -0.43
šanās       -0.43
bouch       -0.43
moi         -0.42
ſte         -0.41
WhereInput  -0.41
Positive Logits
']").       0.46
bri         0.43
evil        0.42
[]:         0.41
schoenen    0.41
sha         0.41
הערות       0.41
))->        0.41
liné        0.40
ped         0.40
Activations Density: 0.181%