INDEX
    Explanations

    actions related to helping or assisting others

    references to frequent actions or common experiences

    Negative Logits
     Bam         -0.70
     Showdown    -0.69
     Variety     -0.67
     Kush        -0.65
     Bern        -0.64
     Kob         -0.63
     ASAP        -0.63
     Ys          -0.62
     PLEASE      -0.62
     Kitty       -0.60
    Positive Logits
    ttes         0.89
    depending    0.83
    oots         0.78
    pas          0.72
    times        0.70
    rist         0.69
    pmwiki       0.68
    entimes      0.68
    ensical      0.68
    rences       0.66
    Activation Density: 0.403%

    No Known Activations