INDEX
    Explanations
    Negative Logits
     Cp  -0.07
    Pr  -0.07
    _list  -0.06
     Po  -0.06
     رفته  -0.06
     playlist  -0.06
    DECLARE  -0.06
    	Time  -0.06
    _smooth  -0.06
    Od  -0.06
    Positive Logits
    fer  0.09
    ۱۹۵  0.07
     перс  0.07
     finer  0.07
    вен  0.07
    ing  0.06
     waiver  0.06
     weakening  0.06
     unter  0.06
     Refer  0.06
    Act Density 0.002%

    No Known Activations