INDEX
    Explanations

    references to the color black and related concepts

    Negative Logits
    "ics"           -0.69
    "ison"          -0.43
    "als"           -0.39
    "hawks"         -0.39
    "burn"          -0.39
    "agascar"       -0.39
    "mund"          -0.38
    "win"           -0.38
    "ropolitan"     -0.38
    "ational"       -0.37

    Positive Logits
    "ges"            0.17
    "abilities"      0.17
    "urry"           0.16
    "rani"           0.15
    " minded"        0.15
    "rub"            0.15
    "ight"           0.14
    "puted"          0.14
    "leen"           0.14
    " Flake"         0.14

    Act Density 0.054%

    No Known Activations