INDEX
Explanations
punctuation marks and their contextual significance in the text
Negative Logits
abus      -0.16
ither     -0.14
aya       -0.14
monic     -0.14
Wyn       -0.13
nome      -0.13
algo      -0.13
einmal    -0.13
emb       -0.13
var       -0.13
Positive Logits
etch      0.15
éry       0.15
tu        0.14
urs       0.14
шку       0.14
lab       0.14
uckle     0.13
ETCH      0.13
tte       0.13
weeted    0.13
Activations Density 0.285%