INDEX
    Explanations

    sentences that indicate approval or consent in research contexts

    Negative Logits
     autorytatywna      -0.54
     so                 -0.53
     something          -0.52
     Something          -0.51
     simply             -0.51
    Something           -0.51
     @"/                -0.50
     qualcosa           -0.49
     quite              -0.48
    hea                 -0.47
    Positive Logits
    CloseOperation      0.82
    qrstuvwxyz          0.75
    ſelves              0.75
    aarrggbb            0.74
     صوتيه              0.68
    ſelf                0.68
    )');                0.68
    NOPQRST             0.66
     themſelves         0.66
     Kessel             0.65
    Act Density 0.723%

    No Known Activations