INDEX
    Explanations

    questions posed rhetorically for confirmation

    Negative Logits
    apan          -0.72
    rament        -0.68
    shaw          -0.66
     foreground   -0.64
     binge        -0.62
    apers         -0.61
     lobster      -0.60
    thro          -0.60
     slam         -0.60
     background   -0.59

    Positive Logits
     Nope          0.96
     Wouldn        0.91
     Yeah          0.87
     Anyway        0.86
     Why           0.84
     Especially    0.84
     Isn           0.84
     Surely        0.83
     Alright       0.81
     Maybe         0.80
    Activation Density: 0.065%

    No Known Activations