INDEX
    Explanations
    Negative Logits
    Token           Logit
    " corrid"       -0.70
    " swamp"        -0.68
    " inclusive"    -0.66
    " portal"       -0.65
    "apan"          -0.65
    "oute"          -0.65
    "stem"          -0.64
    "aper"          -0.64
    " docking"      -0.63
    " binge"        -0.63
    Positive Logits
    Token           Logit
    " Nope"         1.50
    " Absolutely"   1.36
    " Probably"     1.36
    " Possibly"     1.33
    " Certainly"    1.32
    " Maybe"        1.18
    " Yes"          1.18
    " Surely"       1.16
    " Hmm"          1.16
    " Perhaps"      1.14
    Activation Density: 0.071%

    No Known Activations