INDEX
    Explanations

    proper nouns, particularly names of people and places

    Negative Logits
    ovna
    -0.15
    #aa
    -0.15
    つぶ
    -0.14
    %C
    -0.14
    ofday
    -0.13
    êtes
    -0.13
    ازه
    -0.13
     Ä
    -0.12
    /or
    -0.12
    Ìģt
    -0.12
    Positive Logits
     himself
    0.23
    ’s
    0.18
    's
    0.18
     же
    0.16
    -san
    0.16
    的问题
    0.15
    的一个
    0.15
    —who
    0.15
     stesso
    0.14
     Jr
    0.14
    Act Density 0.219%

    No Known Activations
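
    Some token strings in the logit lists above (for example `Ìģt`) look like the byte-level display form used by GPT-2-style tokenizers, where each raw byte is shown as a stand-in printable character. That reading is an assumption, since the page does not name the tokenizer. Under that assumption, the sketch below rebuilds the byte-to-character table published with GPT-2's encoder and maps such a string back to UTF-8 text. The helper name `decode_display_token` is hypothetical.

```python
def bytes_to_unicode() -> dict[int, str]:
    """Byte -> printable stand-in character, as in GPT-2's published encoder."""
    printable = (
        list(range(ord("!"), ord("~") + 1))
        + list(range(ord("¡"), ord("¬") + 1))
        + list(range(ord("®"), ord("ÿ") + 1))
    )
    table = {b: chr(b) for b in printable}
    shift = 0
    for b in range(256):
        if b not in table:
            # Non-printable bytes are shifted to code points 256 and up.
            table[b] = chr(256 + shift)
            shift += 1
    return table


CHAR_TO_BYTE = {ch: b for b, ch in bytes_to_unicode().items()}


def decode_display_token(token: str) -> str:
    """Hypothetical helper: turn a byte-level display string into text.

    Characters outside the table (already-decoded text) are kept by
    re-encoding them as UTF-8. Incomplete byte sequences decode to U+FFFD.
    """
    raw = bytearray()
    for ch in token:
        if ch in CHAR_TO_BYTE:
            raw.append(CHAR_TO_BYTE[ch])
        else:
            raw.extend(ch.encode("utf-8"))
    return raw.decode("utf-8", errors="replace")


print(ascii(decode_display_token("Ìģt")))  # prints '\u0301t'
```

    The helper should only be applied to strings that are still in display form. A token that is already readable text, such as `êtes`, would be misread as raw bytes and come out damaged. A lone stand-in character such as `Ä` is a partial UTF-8 sequence and cannot be decoded on its own.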