Explanations
punctuation marks and sentence structure
Negative Logits
oidal       -0.16
ì§ģ         -0.14
Wik         -0.14
oup         -0.14
surprise    -0.14
çģ          -0.13
Bash        -0.13
ugu         -0.13
Verd        -0.13
umper       -0.13
Positive Logits
bottom      0.21
Bottom      0.20
Bottom      0.20
bottom      0.18
Hopefully   0.18
Interested  0.17
BOTTOM      0.17
BOTTOM      0.16
hopefully   0.16
Overall     0.16
Activations Density 0.111%