    Explanations

    terms related to success and failure, particularly regarding performance metrics

    Negative Logits
    Token      Logit
    "hoot"     -0.22
    "hire"     -0.20
    "iac"      -0.19
    "hall"     -0.18
    "ei"       -0.17
    "sed"      -0.17
    "erre"     -0.16
    "hp"       -0.16
    "hand"     -0.16
    "hydro"    -0.16
    Positive Logits
    Token      Logit
    "TING"     0.33
    "achi"     0.32
    "ting"     0.28
    "ACHI"     0.27
    "ler"      0.27
    " parade"  0.24
    "omi"      0.22
    "lers"     0.22
    "maker"    0.21
    "REC"      0.21
    Activation Density: 0.016%

    No Known Activations