INDEX
    Explanations

    mentions of the word "Bir" with varying activation values

    words related to the term "biracial."

    Negative Logits
    " nomine"    -0.67
    "eur"        -0.67
    "ĵĺ"         -0.63
    "EMBER"      -0.61
    " Tune"      -0.60
    " paternal"  -0.59
    " tune"      -0.58
    " wise"      -0.58
    "Unit"       -0.58
    " plat"      -0.57
    Positive Logits
    "mingham"  1.28
    "thing"    1.08
    "git"      1.05
    "ging"     0.94
    "ney"      0.92
    "chell"    0.91
    "keley"    0.91
    "combe"    0.87
    "gel"      0.87
    "ulia"     0.86
    Activation Density: 0.021%

    No Known Activations