INDEX
    Explanations

    expressions of enjoyment or positive experiences

    Negative Logits
    Token          Logit
    "ine"          -0.67
    "Al"           -0.61
    " Ber"         -0.60
    " Hu"          -0.58
    " Kle"         -0.58
    " Ra"          -0.58
    "netinet"      -0.58
    " T"           -0.57
    "B"            -0.57
    "pmatrix"      -0.57

    Positive Logits
    Token          Logit
    "enjoy"        1.88
    " Enjoy"       1.85
    " enjoy"       1.80
    " ENJOY"       1.76
    " enjoyed"     1.75
    " enjoyment"   1.69
    "Enjoying"     1.68
    "Enjoy"        1.66
    " enjoys"      1.63
    " enjoying"    1.60
    Act Density 0.037%

    No Known Activations