    Negative Logits
    "workers"        -0.07
    "proof"          -0.07
    " Located"       -0.06
    " reduced"       -0.06
    "#@"             -0.06
    " Tud"           -0.06
    ".Highlight"     -0.06
    "ادات"           -0.06
    "FolderPath"     -0.06
    "-dropdown"      -0.06
    Positive Logits
    " arrogant"      0.07
    " adb"           0.06
    " інт"           0.06
    "(el"            0.06
    " lofty"         0.06
    " arrogance"     0.06
    " instal"        0.06
    [token not shown] 0.06
    "ros"            0.06
    "leich"          0.06
    Activation Density: 0.022%

    No Known Activations