INDEX
    Explanations

    affirmative responses indicating agreement or acknowledgment

    Negative Logits
    "mis"       -0.59
    "rif"       -0.58
    "estor"     -0.58
    "://"       -0.57
    "larak"     -0.57
    " ny"       -0.57
    ":\/\/"     -0.56
    " Mü"       -0.56
    "sive"      -0.55
    " Berk"     -0.55
    POSITIVE LOGITS
    " YEAH"     1.90
    " Yeah"     1.87
    "Yeah"      1.84
    " yeah"     1.83
    "yeah"      1.71
    "YEAH"      1.67
    " Yep"      1.40
    "Yep"       1.33
    " Yea"      1.32
    " yep"      1.28
    Activation density: 0.064%

    No Known Activations