    Explanations

    white in different languages

    Negative Logits
    Token           Logit
    "DARK"          0.70
    "Dark"          0.68
    " dark"         0.66
    " Dark"         0.64
    " DARK"         0.61
    "blackberry"    0.61
    "Purple"        0.59
    " darker"       0.59
    "dark"          0.57
    " шокола"       0.55
    Positive Logits
    Token           Logit
    " white"        2.16
    "white"         2.00
    (no token text shown)  1.89
    "White"         1.87
    " White"        1.87
    " WHITE"        1.87
    " белый"        1.82
    "WHITE"         1.80
    " सफेद"         1.77
    "白色"          1.77
    Activation Density: 0.131%

    No Known Activations