    Explanations

    instances of the word "race" and its variations

    Negative Logits
    olia          -0.76
    iar           -0.76
    oca           -0.75
    arial         -0.73
    tymology      -0.72
    lishes        -0.71
    berra         -0.71
    vironment     -0.67
    osis          -0.66
    iated         -0.65
    Positive Logits
    horse         1.31
    course        1.29
    cars          1.05
    bike          1.04
    car           0.98
    nell          0.84
    runners       0.83
    runner        0.83
     bikes        0.82
    goers         0.79
    Activation Density: 0.018%

    No Known Activations