INDEX
    Explanations
    Negative Logits
    Token           Logit
    " certified"    -0.07
    "_px"           -0.07
    "OneToMany"     -0.06
    "prefer"        -0.06
    ".fd"           -0.06
    " whenever"     -0.06
    " Licensing"    -0.06
    "_dict"         -0.06
    "Lastly"        -0.06
    " مایل"         -0.06
    Positive Logits
    Token           Logit
    (not rendered)  0.07
    " Spectrum"     0.07
    "awi"           0.07
    "يه"            0.06
    " troub"        0.06
    (not rendered)  0.06
    " mong"         0.06
    (not rendered)  0.06
    " amet"         0.06
    " scour"        0.06
    Activation Density: 0.079%

    No Known Activations