INDEX
    Explanations
    Negative Logits
    "Clik"             -0.66
    " متعلقه"          -0.63
                       -0.52
    "Boxing"           -0.50
    "Fang"             -0.50
    " createState"     -0.50
    "Azi"              -0.50
    "topf"             -0.49
    " salami"          -0.48
    "ReusableCell"     -0.48
    Positive Logits
    " Nevertheless"    1.80
    "Nevertheless"     1.73
    " nevertheless"    1.73
    "theless"          1.61
    " nonetheless"     1.28
    "Nonetheless"      1.27
    " Nonetheless"     1.27
    "anmoins"          1.22
    " néanmoins"       1.20
    "etheless"         1.15
    Activation Density: 0.008%

    No Known Activations