INDEX
    Explanations

    instances of the word "rarely" followed by a non-zero activation value

    instances of the word "rarely" and its synonyms, indicating infrequency

    Negative Logits
     Destruction    -0.72
     Submission     -0.71
    oÄŁ             -0.70
    arta            -0.69
    utenberg        -0.68
    uers            -0.67
    jri             -0.65
    uid             -0.65
    andi            -0.65
     Emirates       -0.64
    Positive Logits
    theless         1.24
    entimes         1.06
    icably          0.95
    epad            0.87
     dime           0.80
    etheless        0.79
     bothered       0.77
     Asked          0.77
     hesitate       0.77
    pmwiki          0.77
    Act Density 0.009%

    No Known Activations