INDEX
    Explanations

    references to people's names, especially repeated mentions of "Ruth" and "Babe Ruth".

    references to the name "Ruth."

    Negative Logits
    Token          Logit
    "gotten"       -0.73
    "ctica"        -0.67
    "agons"        -0.62
    "olesc"        -0.62
    "artney"       -0.62
    "iator"        -0.62
    "tein"         -0.61
    " Helsinki"    -0.61
    "opal"         -0.61
    "akening"      -0.60
    Positive Logits
    Token          Logit
    "anne"         0.86
    " Ruth"        0.85
    "lessly"       0.84
    "uth"          0.79
    "less"         0.77
    "lessness"     0.74
    "enthal"       0.71
    "utherford"    0.71
    "mite"         0.70
    "anna"         0.68
    Activation Density: 0.012%

    No Known Activations