INDEX
Explanations
punctuation marks indicating the end of sentences
Negative Logits
finger      -0.17
Finger      -0.17
pine        -0.15
finger      -0.15
fingers     -0.15
ÄĽn         -0.14
cki         -0.14
rol         -0.14
plib        -0.14
ErrorMsg    -0.14
Positive Logits
æķĪ         0.16
AAD         0.16
ĵn          0.15
quier       0.15
eff         0.14
adio        0.14
tors        0.14
@}          0.14
ipro        0.14
opro        0.13
Activations Density 0.000%