    Explanations

    numbers in code snippets

    Negative Logits
    Token         Logit
     upraw        -0.69
     Fahrt        -0.69
    estu          -0.67
    Property      -0.67
    اض            -0.66
     discredit    -0.66
     schlagen     -0.65
                  -0.65
    uppa          -0.65
    zzo           -0.64
    Positive Logits
    Token         Logit
     пункт        0.79
    btnClose      0.73
     変           0.69
     onError      0.66
    detal         0.66
     اخبار        0.66
     пожар        0.66
     Sterling     0.65
    spunkt        0.65
     }))          0.64
    Activation Density: 0.077%

    No Known Activations