INDEX
Explanations
punctuation marks, particularly commas and periods
Negative Logits
oret         -0.15
cf           -0.15
seed         -0.15
-fw          -0.14
,“           -0.14
»¿           -0.14
rie          -0.14
ringe        -0.13
alternate    -0.13
irs          -0.13
Positive Logits
tongue       0.19
ToPoint      0.18
quote        0.16
quote        0.15
quier        0.15
agit         0.15
ideo         0.14
ê¸ī          0.14
somewhat     0.14
aed          0.14
Activations Density: 0.085%