INDEX
    Explanations

    personal pronouns followed by verbs or adjectives describing actions or attributes

    Negative Logits
    Token         Logit
    dp            -0.67
    Ĥ¬            -0.67
    dding         -0.64
    Reply         -0.63
    icial         -0.61
    inton         -0.59
     GCC          -0.59
    essions       -0.58
    orses         -0.57
    612           -0.57
    Positive Logits
    Token         Logit
     excel        0.78
     thri         0.77
     humble       0.72
     behaves      0.69
     morph        0.68
     popularity   0.68
     thrive       0.68
     creators     0.68
     ubiqu        0.66
     existed      0.66
    Activation Density: 0.802%

    No Known Activations