    Explanations

    the word "myself"

    references to self-identity and personal experiences

    Negative Logits
    Token          Logit
    "ories"        -0.79
    "ulton"        -0.78
    "olid"         -0.77
    "cemic"        -0.71
    "heny"         -0.70
    "grade"        -0.70
    "orie"         -0.67
    "ibaba"        -0.65
    " Sierra"      -0.62
    "*/("          -0.62

    Positive Logits
    Token          Logit
    "selves"        1.02
    " myself"       0.99
    " personally"   0.94
    "self"          0.92
    " tremend"      0.90
    " enthusi"      0.85
    "imei"          0.80
    " ashamed"      0.77
    " honoured"     0.76
    " instinct"     0.75

    Activation Density: 0.010%

    No Known Activations