    Explanations

    phrases containing the conjunction "and" along with references to colors, particularly "black" and "white."

    Negative Logits
    Token          Logit
    "ettle"        -0.15
    "avigate"      -0.15
    "sz"           -0.14
    " stabbing"    -0.14
    "ån"           -0.13
    "ναν"          -0.13
    "iken"         -0.13
    " رب"          -0.13
    " Cotton"      -0.13
    "oure"         -0.13
    Positive Logits
    Token          Logit
    " white"       0.32
    " White"       0.28
    "white"        0.27
    " whites"      0.27
    "çϽ"           0.26
    "WHITE"        0.24
    "White"        0.24
    "-white"       0.24
    " çϽ"          0.23
    " WHITE"       0.23
    Act Density 0.041%

    No Known Activations