    Negative Logits
    pants           -0.08
    powered         -0.07
     crawled        -0.07
     законодав      -0.07
    .PerformLayout  -0.07
    \Common         -0.06
     discourse      -0.06
    hospital        -0.06
    Directory       -0.06
     visto          -0.06

    Positive Logits
     محل            0.06
    ляв             0.06
     görev          0.06
     Sutton         0.06
     thép           0.06
    reur            0.06
    (exec           0.06
    (bytes          0.06
    ek              0.05
    "type           0.05
    Activation Density: 0.013%

    No Known Activations