    Negative Logits
     hoa
    -0.07
     allerg
    -0.07
     Discounts
    -0.07
    ukes
    -0.06
    Compet
    -0.06
    .sg
    -0.06
    ptom
    -0.06
    ха
    -0.06
    -Feb
    -0.06
     لكن
    -0.06
    Positive Logits
    useState
    0.06
     est
    0.06
    (['
    0.06
     Rebel
    0.06
    _constructor
    0.06
     ל
    0.06
                         
    0.06
    0.06
    ...
    ↵
    0.06
    '},↵
    0.06
    Activation Density: 0.029%

    No Known Activations