    Explanations

    claiming/saying

    Negative Logits
    Token         Logit
    "-g"          -0.07
    " texte"      -0.07
    "adam"        -0.06
    "ema"         -0.06
    " evangel"    -0.06
    "asures"      -0.06
    "652"         -0.06
    "_tm"         -0.06
    "_ang"        -0.06
    " ngay"       -0.06
    Positive Logits
    Token           Logit
    " yık"          0.07
    "_eta"          0.07
    " sparing"      0.06
    "LET"           0.06
    " brutality"    0.06
    " betting"      0.06
    " psyched"      0.06
    " خی"           0.06
    " dễ"           0.06
    " Covenant"     0.06
    Activation Density: 0.076%

    No Known Activations