INDEX
    Explanations
    Negative Logits
    Token               Logit
    (not shown)         -2.56
    " they"             -2.53
    " and"              -2.45
    " The"              -2.39
    (not shown)         -2.36
    " Familienname"     -2.22
    (not shown)         -2.22
    "↵↵"                -2.19
    " besonders"        -2.17
    " ‘"                -2.13
    Positive Logits
    Token               Logit
    "."                 3.91
    "2"                 3.27
    "ية"                3.02
    "5"                 2.67
    "N"                 2.67
    "empêcher"          2.58
    "L"                 2.53
    "วัติ"               2.53
    " صغير"             2.52
    " autres"           2.50
    Activation Density: 0.003%

    No Known Activations