INDEX
    Explanations

    mentions of a specific individual's name in a negative context

    words related to an individual's name or identity

    Negative Logits
    Token      Logit
    schild     -0.78
    space      -0.76
    enegger    -0.74
    line       -0.72
    starter    -0.69
    sheet      -0.69
    tal        -0.69
    birds      -0.66
    lings      -0.65
    hawk       -0.64
    Positive Logits
    Token      Logit
    ñ          0.94
    edia       0.90
    pered      0.89
    cess       0.88
    uthor      0.84
    pload      0.83
    odcast     0.81
    pa         0.81
    resa       0.80
    apa        0.80
    Activation Density: 0.011%

    No Known Activations