INDEX
    Explanations

    names of famous individuals

    mentions of specific names or identifiers, particularly related to individuals and their actions

    Negative Logits
    Token          Logit
    "ł"            -0.77
    "cember"       -0.76
    " Fighter"     -0.76
    "ĺħ"           -0.75
    "²¾"           -0.73
    " Pole"        -0.72
    "awei"         -0.69
    "ļéĨĴ"         -0.69
    "à¼"           -0.69
    " Pug"         -0.66
    Positive Logits
    Token          Logit
    "sis"          1.00
    "autions"      0.89
    "ement"        0.86
    "ading"        0.84
    "ilitating"    0.77
    "rien"         0.73
    "aged"         0.73
    "iliated"      0.72
    " Trou"        0.72
    "bles"         0.72
    Activation Density: 0.022%

    No Known Activations