INDEX
    Explanations

    the word "not" followed by a high activation word

    the phrase "not only" and its variations

    Negative Logits
    stakes        -0.67
     Quarterly    -0.67
     Budapest     -0.60
    plate         -0.58
     Grimoire     -0.57
    ixel          -0.57
     Frenzy       -0.55
    ç·            -0.55
     Spotlight    -0.54
     Pulse        -0.54
    Positive Logits
    epad          1.42
    icably        1.41
    ched          1.18
    ices          1.17
    ches          1.16
    hin           1.15
    ifying        1.15
    ifies         1.15
    ifications    1.10
    ifier         1.07
    Act Density 0.129%

    No Known Activations