INDEX
    Explanations

    conversational phrases and discourse markers in discussions or arguments

    Negative Logits
    Token            Logit
    "letal"          -0.15
    " Definitely"    -0.15
    "viar"           -0.14
    "iesen"          -0.14
    "plier"          -0.14
    "SEQUENTIAL"     -0.14
    "Latch"          -0.14
    "omial"          -0.14
    "éĬ"             -0.14
    "quip"           -0.14
    Positive Logits
    Token            Logit
    " Slo"           0.16
    " fe"            0.15
    " Fle"           0.15
    " fascinating"   0.15
    " Sloan"         0.14
    " Strom"         0.14
    " toggle"        0.14
    " Tweets"        0.14
    " Mik"           0.14
    " chy"           0.14
    Activation Density: 0.016%

    No Known Activations