INDEX
    Explanations
    No Explanations Found
    Negative Logits
    اب    1.90
    (no token shown)    1.88
    aient    1.84
    ik    1.79
    ie    1.77
    ى    1.76
    ת    1.74
     COLLATION    1.73
    ش    1.73
    ك    1.73
    Positive Logits
    (no token shown)    2.22
     reasons    1.91
    (no token shown)    1.83
    biotics    1.79
    (no token shown)    1.77
     lẽ    1.72
     bother    1.70
    с    1.66
     buz    1.63
     Reasons    1.63
    Act Density 0.617%

    No Known Activations