INDEX
    Explanations
    Negative Logits
    .filters  -0.07
     pocházet  -0.07
     موبایل  -0.06
    ERCHANT  -0.06
    	al  -0.06
     telefono  -0.06
     těch  -0.06
     هنوز  -0.06
     userInfo  -0.06
    asıyla  -0.06
    Positive Logits
     functor  0.11
     Functor  0.09
    ctors  0.09
     doing  0.08
    uly  0.07
    [token not captured]  0.07
     goof  0.07
    _take  0.07
     Ventura  0.07
     sober  0.07
    Act Density 0.001%

    No Known Activations