INDEX
    Explanations

    adjectives relating to color, specifically the color red

    references to the color red

    Negative Logits
    "ernel"      -0.81
    "agall"      -0.74
    "awaru"      -0.72
    "Ö¼"         -0.70
    "Reloaded"   -0.68
    "vre"        -0.68
    "UGH"        -0.67
    "XT"         -0.67
    " Lank"      -0.65
    "ILA"        -0.65
    Positive Logits
    "efined"      1.23
    "iscovered"   1.14
    "irection"    1.13
    "neck"        1.13
    "oubt"        1.12
    "iscover"     1.12
    "iscovery"    1.11
    "rawn"        1.08
    "oub"         1.04
    " velvet"     1.00
    Activation Density: 0.026%

    No Known Activations