INDEX
    Explanations

    the overall effectiveness or summary results in a given context

    NEGATIVE LOGITS
    er      -0.75
    en      -0.71
            -0.66
    u       -0.66
    o       -0.65
    an      -0.62
    in      -0.60
    ik      -0.60
    y       -0.60
    io      -0.59
    POSITIVE LOGITS
    OVERALL        1.94
    overall        1.87
     overall       1.83
     Overall       1.80
    Overall        1.79
     overal        1.32
     overalls      1.17
     itſelf        1.16
     Insgesamt     1.11
     geral         1.10
    Act Density: 0.093%

    No Known Activations