INDEX
    Explanations

    proper nouns and corresponding punctuation marks as they appear at the end of sentences

    Negative Logits
    Token         Logit
    ¥µ            -0.76
    acly          -0.72
     ("           -0.71
    ahime         -0.68
    negie         -0.66
    roximately    -0.61
    —"            -0.60
     "            -0.58
    avored        -0.58
    rely          -0.58
    Positive Logits
    Token         Logit
     '.           2.18
    ,'            2.13
     ',           2.10
    .'            2.09
    ?'            2.08
    ','           1.99
    ';            1.96
     '[           1.92
    ',            1.91
    '.            1.90
    Activation Density: 0.216%

    No Known Activations