INDEX
Explanations
special characters and symbols in the text
Negative Logits
ked       -0.16
Kathryn   -0.15
opis      -0.15
uya       -0.15
YLON      -0.15
-k        -0.14
oku       -0.14
jez       -0.14
Katz      -0.14
9         -0.14
Positive Logits
mp        0.34
nce       0.33
ÈĻi       0.30
mpr       0.30
nger      0.29
nc        0.28
mb        0.28
ns        0.27
nde       0.25
mpl       0.25
Activations Density 0.006%