    Explanations

    statements or quotes spoken by someone

    instances of dialogue or statements made by individuals

    Negative Logits
    ãĥİ
    -0.72
    thur
    -0.65
    ãĥĥãĥĪ
    -0.65
    paralle
    -0.63
    MU
    -0.62
    arent
    -0.62
     pend
    -0.60
    å§«
    -0.59
     ILCS
    -0.59
    resent
    -0.59
    Positive Logits
     sarcast
    1.06
     bluntly
    1.02
     rhet
    0.96
     emphatically
    0.89
     referring
    0.83
     afterward
    0.81
     diplom
    0.78
     aloud
    0.78
     incred
    0.77
     proudly
    0.75
    Activation Density 0.123%

    No Known Activations