INDEX
    Explanations

    phrases expressing personal reflections and experiences

    Negative Logits
    "andex"       -0.15
    "ayet"        -0.15
    "isy"         -0.14
    "ilestone"    -0.14
    "wan"         -0.14
    "egot"        -0.14
    "Ĵ"           -0.14
    "ield"        -0.14
    "thouse"      -0.14
    "avanaugh"    -0.13
    Positive Logits
    " myself"        0.54
    " mine"          0.51
    " personally"    0.48
    "Personally"     0.42
    " Personally"    0.40
    "mine"           0.37
    " ours"          0.35
    "Mine"           0.34
    " my"            0.34
    " saya"          0.33
    Activation Density: 0.371%

    No Known Activations