INDEX
    Explanations

    mentions of the name "Winfrey."

    Negative Logits
    ea       -0.19
    een      -0.18
    duct     -0.18
    eed      -0.18
    ee       -0.18
    eer      -0.17
    aler     -0.16
    ed       -0.16
    venge    -0.16
    acks     -0.15
    Positive Logits
    -win     0.26
    throp    0.24
    ning     0.24
    /win     0.23
    ona      0.23
    ners     0.23
    eries    0.22
    try      0.21
    ograd    0.20
    nable    0.20
    Activation Density: 0.018%

    No Known Activations