    Explanations

    references to the color white or its variations in context

    Negative Logits
    Token        Logit
    "adesh"      -0.16
    "agger"      -0.15
    " Ñĥв"       -0.15
    "amma"       -0.15
    " purple"    -0.14
    "yny"        -0.14
    "mann"       -0.14
    "uby"        -0.14
    "IMP"        -0.14
    "emap"       -0.14
    Positive Logits
    Token        Logit
    "WHITE"      0.26
    "-white"     0.25
    " white"     0.25
    " WHITE"     0.24
    "White"      0.24
    " White"     0.23
    "white"      0.22
    " çϽ"       0.21
    ".White"     0.20
    " سÙģÛĮد"    0.20
    Activation Density: 0.072%

    No Known Activations