INDEX
    Explanations

    names of individuals

    repeated mentions of specific names and unique identifiers in the text

    Negative Logits
    istics      -0.86
    parts       -0.77
    ARD         -0.70
    à¦          -0.70
    ãģį         -0.69
     fry        -0.68
    ariat       -0.67
     WATCHED    -0.65
     Reincarn   -0.64
     membr      -0.63
    Positive Logits
     Jed        1.20
    seys        1.02
    ouble       0.80
    hua         0.80
    arkin       0.79
    lik         0.79
    rus         0.76
    ko          0.75
    warm        0.74
    ealous      0.74
    Activation Density: 0.015%

    No Known Activations