INDEX
    Explanations

    phrases that emphasize the importance of accuracy

    Negative Logits
    Token          Logit
    aker           -1.77
    renn           -1.55
    acker          -1.54
    itian          -1.51
    orian          -1.48
    ori            -1.48
    urer           -1.47
    ker            -1.47
    obacterium     -1.46
    ussels         -1.45
    Positive Logits
    Token          Logit
    ŀ              3.10
    ¥              3.02
    ģ              2.99
    ¡              2.97
    Ļ              2.88
    «              2.79
    ĻĤ             2.77
    Ģ              2.75
    µ              2.75
    ļ              2.75
    Activation Density: 0.015%

    No Known Activations