    Negative Logits
    " ModelExpression"    -0.67
    "onViewCreated"       -0.67
    "rouse"               -0.59
    "########."           -0.54
    "TagMode"             -0.52
    " Téléchargez"        -0.52
    " mukana"             -0.52
    "érité"               -0.51
    "apunov"              -0.50
    "лежа"                -0.50
    Positive Logits
    [token not captured]  1.08
    " is"                 1.06
    "'"                   0.96
    " lenker"             0.65
    "`"                   0.59
    "̵"                   0.57
    " continues"          0.54
    " remains"            0.54
    "imageshack"          0.53
    "´"                   0.53
    Activation Density: 0.004%

    No Known Activations