    Negative Logits
    :{            -0.07
    .ByteString   -0.06
    159           -0.06
    ault          -0.06
    _UPPER        -0.06
     بده          -0.06
    مود           -0.06
     exhibit      -0.06
    _Con          -0.06
    uild          -0.06
    Positive Logits
    getFullYear   0.07
     abl          0.06
    (ib           0.06
     میتوان       0.06
     caramel      0.06
    llum          0.06
     {\           0.06
     placement    0.06
    decision      0.06
    "/>↵↵         0.06
    Activation Density: 0.003%

    No Known Activations