INDEX
    Explanations

    instances of dialogue or conversational exchanges

    Negative Logits
    Token               Logit
    "ents"              -0.15
    "yre"               -0.14
    "erner"             -0.14
    "hunter"            -0.14
    "atÄĥ"              -0.14
    " professionnel"    -0.13
    "fid"               -0.13
    "rede"              -0.13
    "st"                -0.13
    "iom"               -0.13
    Positive Logits
    Token               Logit
    "ounder"            0.16
    "ocha"              0.15
    " Halk"             0.14
    "á»ħ"               0.14
    "ìł"                0.13
    " Roose"            0.13
    "*$"                0.13
    " disfr"            0.13
    " cấp"              0.13
    " ìŀ¬"              0.13
    Activation Density: 0.247%

    No Known Activations