INDEX
    Explanations

    the word "anyone" with a relatively high activation value

    the word "anyone"

    the word "anyone" and its variations throughout the text

    Negative Logits
    "pa"             -0.65
    "ories"          -0.63
    "ritz"           -0.61
    " Kitchen"       -0.61
    " Congo"         -0.60
    " Labor"         -0.60
    " Shows"         -0.59
    " BDS"           -0.59
    " Maze"          -0.59
    "neck"           -0.59
    Positive Logits
    " else"          1.55
    "THING"          1.26
    "Else"           1.14
    " Else"          1.03
    "else"           0.98
    "soever"         0.97
    " imaginable"    0.88
    "20439"          0.87
    "omever"         0.87
    " doubted"       0.86
    Activation Density: 0.017%

    No Known Activations