Explanations
punctuation and special characters
Negative Logits
bose        -0.16
ocket       -0.15
iscard      -0.15
sandwich    -0.14
Independ    -0.14
eton        -0.14
lav         -0.14
achat       -0.14
ilters      -0.13
valuate     -0.13
Positive Logits
âĨij        0.26
âĨij        0.23
^           0.21
^           0.20
^↵          0.20
Ret         0.20
Wik         0.19
Template    0.18
Template    0.17
.^          0.17
Activations Density 0.013%