    Explanations

    mentions of the color orange

    references to the color orange and aluminum

    Negative Logits
    risome    -1.13
    fare      -0.93
    neys      -0.86
    awar      -0.85
    tsky      -0.84
    liness    -0.83
    ringe     -0.82
    friend    -0.76
    rol       -0.73
    nam       -0.71
    Positive Logits
     flats        0.73
     slic         0.67
     cans         0.66
     foil         0.66
     dotted       0.63
     toast        0.63
     hops         0.63
     skirts       0.61
     peel         0.60
     simultane    0.59
    Activation Density: 0.075%

    No Known Activations