INDEX
Explanations
lyrics from songs
instances of special characters or symbols
Negative Logits
Mous          -0.85
Þ             -0.83
tremend       -0.82
anwhile       -0.82
princ         -0.81
eleph         -0.81
scrambling    -0.77
ASC           -0.77
sacrific      -0.75
conduc        -0.75
Positive Logits
ľ             1.72
ł             1.25
Ķ             1.21
Ŀ             1.20
ļ             1.12
Ł             1.08
ĺ             1.07
IJ            1.04
¤             1.03
¼             1.02
Activations Density: 0.204%