    Explanations

    punctuation and quotation marks at the end of sentences

    Negative Logits
    Token    Logit
     "       -1.03
     -"      -0.93
    。"      -0.86
    -"       -0.83
    "        -0.83
     "'      -0.78
    "...     -0.78
    ..."     -0.77
     "...    -0.77
    "'       -0.76
    Positive Logits
    Token    Logit
    .”       2.08
    ”).      2.02
    ,”       2.02
    ”,       2.01
    )”       2.01
             1.97
    ”)       1.96
    ”),      1.94
    ?”       1.94
    ”.       1.93
    Activation Density: 0.165%

    No Known Activations