    Negative Logits
     Jefus
    -0.49
    Hentet
    -0.43
    DrawerToggle
    -0.41
     típico
    -0.40
    Citiți
    -0.40
    LabelTagHelper
    -0.40
    PositiveButton
    -0.40
     Inscrivez
    -0.40
     typique
    -0.37
    lapsingToolbar
    -0.37
    Positive Logits
     يتيمه
    0.66
    RegressionTest
    0.57
    
    0.51
    Зноскі
    0.45
    licability
    0.45
    scrapy
    0.44
    Personendaten
    0.44
     isComment
    0.44
    קישורים
    0.43
     szcz
    0.43
    Activation Density: 0.726%

    No Known Activations