INDEX
    Explanations

    instances where the phrase "what the" is followed by a description or question

    occurrences of the word "the"

    Negative Logits
    Token         Logit
    "alus"        -0.78
    "avan"        -0.77
    "ignt"        -0.72
    "onduct"      -0.70
    "thia"        -0.69
    "dropping"    -0.68
    "nav"         -0.68
    "hops"        -0.68
    "aunder"      -0.66
    "velt"        -0.66
    Positive Logits
    Token               Logit
    " heck"             1.66
    " hell"             1.57
    " fuss"             1.34
    " fuck"             1.28
    " ramifications"    0.99
    " future"           0.96
    " implications"     0.94
    " consequences"     0.91
    " HELL"             0.91
    " repercussions"    0.88
    Activation Density: 0.084%

    No Known Activations