INDEX
    Explanations

    mentions of specific people's names

    NEGATIVE LOGITS
    " Prelude"          -0.74
    "uate"              -0.69
    " headache"         -0.69
    " enclosed"         -0.65
    " succeeding"       -0.63
    " duplication"      -0.63
    " gratification"    -0.62
    " psy"              -0.62
    " headaches"        -0.62
    " bottleneck"       -0.61

    POSITIVE LOGITS
    "ITNESS"             1.35
    "OOD"                1.24
    "ALK"                1.22
    "ITCH"               1.20
    "arsh"               1.17
    "idespread"          1.17
    "orthy"              1.16
    "rote"               1.15
    "OW"                 1.13
    "edge"               1.12

    Act Density 0.027%

    No Known Activations