INDEX
    Explanations

    phrases enclosed in double quotation marks

    quotes and dialogue marks in the text

    Negative Logits
    !'      -1.14
    ,'      -1.11
    ?'      -1.11
    .'      -1.09
    ,'"     -0.94
    ?'"     -0.89
    .'"     -0.89
    )'      -0.88
    !'"     -0.87
    」      -0.84
    Positive Logits
     "      2.42
     "'     1.99
     "[     1.95
     "…     1.84
     "...   1.79
     "#     1.79
     "(     1.78
     "-     1.71
     "$     1.68
     ".     1.63
    Act Density 0.140%

    No Known Activations