INDEX
    Explanations

    instances of the word "pretend" in various contexts

    Negative Logits
    jack              -0.17
    /lg               -0.17
    ëŀ                -0.17
    iÄįka             -0.16
    aiser             -0.16
    ActionCreators    -0.15
    oggler            -0.15
     âĹĦ              -0.15
    ÐŁÐļ              -0.15
    rese              -0.15
    Positive Logits
    pret              0.16
    779               0.15
    REP               0.14
    uous              0.14
    ceptive           0.14
    ÑĦи               0.13
    918               0.13
    glich             0.13
    pid               0.13
    -tip              0.13
    Activation Density: 0.013%

    No Known Activations