INDEX
    Explanations

    proper nouns of TV shows and universities

    punctuation marks and specific character formatting in written content

    Negative Logits
    bro         -0.75
    onian       -0.74
     whis       -0.73
    itaire      -0.71
    eb          -0.71
    cube        -0.70
    kefeller    -0.70
    woods       -0.70
    runners     -0.69
    cies        -0.68
    Positive Logits
     âĢ         1.90
     «          1.33
     **         1.31
     *          1.28
     ¶          1.24
     âĶ         1.22
     â          1.19
     ***        1.19
     ãĢĮ        1.18
     ®          1.16
    Activation Density: 0.231%

    No Known Activations