INDEX
    Explanations
    NEGATIVE LOGITS
    "_collect"       -0.07
    " Winter"        -0.07
    " только"        -0.07
    " برخی"          -0.07
    " rampant"       -0.06
    "\tlib"          -0.06
    " тільки"        -0.06
    "리아"            -0.06
    " Lifetime"      -0.06
    "Quad"           -0.06
    POSITIVE LOGITS
                          0.07
    " first"              0.07
    " издел"              0.06
    "roll"                0.06
    " uname"              0.06
    " nichts"             0.06
    " seriousness"        0.06
    "่าจะ"                0.06
    " propertyName"       0.06
    " BRE"                0.06
    Act Density 0.015%

    No Known Activations