INDEX
    Explanations

    specific punctuation or formatting symbols, particularly quotation marks

    Punctuation (various kinds) preceding a word

    demographic statistics or evidentiary support

    Negative Logits
     ».
    -0.87
    ”.
    -0.87
    *.
    -0.85
    ].
    -0.82
     ].
    -0.81
    .\n
    -0.79
     []).
    -0.77
     }.
    -0.77
    }.
    -0.77
    ".
    -0.76
    Positive Logits
    Basically
    0.71
    ',"
    0.69
    ,"
    0.66
     Basically
    0.62
    ,'"
    0.61
    Literally
    0.60
    ),"
    0.60
     I
    0.59
    cookieParser
    0.58
    <bos>
    0.58
    Activation Density: 0.036%

    No Known Activations