INDEX
    Explanations

    Punctuation, particularly commas and quotation marks, that signals conversational flow or emphasis

    Negative Logits
    Token           Logit
     â              -0.17
    kyt             -0.16
     '..            -0.16
     “â̦            -0.16
    наÑĩ            -0.15
    (missing)       -0.14
    BorderStyle     -0.13
    edy             -0.13
    etine           -0.13
    den             -0.13
    Positive Logits
    Token           Logit
    ÂĿ              0.39
    ãĢģ“            0.22
    |"              0.21
    -"              0.20
    ("              0.18
    /"              0.17
    ="              0.17
    vation          0.16
    __              0.16
     "              0.16
    Act Density 0.181%

    No Known Activations