INDEX
    Explanations
    NEGATIVE LOGITS
    itat          -0.07
    jm            -0.07
     trapped      -0.06
    Va            -0.06
     نفت          -0.06
     سكان         -0.06
    attack        -0.06
    .func         -0.06
    _OT           -0.06
    rikes         -0.06
    POSITIVE LOGITS
     Rece                0.06
    (token not shown)    0.06
     Machinery           0.06
    (token not shown)    0.06
     Hed                 0.06
     Ans                 0.06
     overturned          0.06
     به                  0.06
    ΡΓ                   0.06
    .hstack              0.06
    Act Density 0.000%

    No Known Activations