INDEX
    Explanations

    color-related words, specifically variations of the color red

    the occurrences of the word "Red."

    Negative Logits
    Token       Logit
    ammad       -0.78
    SPONSORED   -0.76
    OHN         -0.75
    gerald      -0.71
     4090       -0.69
    areth       -0.68
    =-=-=-=-    -0.65
    =-=-        -0.64
    Ö¼          -0.63
     Ellis      -0.63
    Positive Logits
    Token       Logit
    uces        1.14
    uced        1.05
    rawn        1.03
    uce         1.03
    ucing       1.02
    ouble       1.00
    uctor       0.98
    cliffe      0.94
    der         0.92
    irect       0.87
    Activation Density: 0.011%

    No Known Activations