INDEX
    Negative Logits
     استنادى            -0.46
     penn               -0.44
     Kraw               -0.44
     Lend               -0.44
     Новости            -0.44
     far                -0.43
     tričko             -0.42
    ipedi               -0.42
    .*")]               -0.41
     Persson            -0.41
    Positive Logits
     desertcart         1.00
     nahilalakip        0.93
    aarrggbb            0.92
     ModelExpression    0.82
    desertcart          0.73
     ddelweddau         0.71
    IndentedString      0.68
    IUrlHelper          0.68
    adpleegd            0.67
    ),),                0.64
    Activation Density: 0.033%

    No Known Activations