    Negative Logits
    Logit   Token
    -0.07   "%S"
    -0.07   "ør"
    -0.07
    -0.07   " ASF"
    -0.07   " Refriger"
    -0.07   " thrift"
    -0.06   " fracture"
    -0.06   " bruk"
    -0.06
    -0.06   "-reader"
    Positive Logits
    Logit   Token
    0.08    "остоя"
    0.07    "_linear"
    0.07    "enderit"
    0.07    "جتماعية"
    0.06
    0.06
    0.06    "eea"
    0.06    " الحكومية"
    0.06    "$('"
    0.06    " المجتمع"
    Act Density 0.002%

    No Known Activations