INDEX
    Explanations
    No Explanations Found
    NEGATIVE LOGITS
    "p"               1.14
    "pire"            0.94
    "gr"              0.93
    "pad"             0.92
    "pw"              0.90
    "s"               0.89
    "su"              0.88
    "k"               0.88
    "st"              0.86
    "paces"           0.85
    POSITIVE LOGITS
    " anderer"        1.21
    " vijf"           1.20
    " başka"          1.17
    " hermana"        1.16
    " folhas"         1.13
    " particulières"  1.13
    " echter"         1.12
    " belangrijkste"  1.12
    " пять"           1.11
    " kleinere"       1.09
    Activation Density: 0.000%

    No Known Activations