INDEX
    Explanations

    phrases related to revealing information, especially plot twists and spoilers

    phrases related to habits and routines

    NEGATIVE LOGITS
    ayne            -0.69
    hran            -0.69
    apore           -0.69
    lez             -0.69
    english         -0.66
    >>              -0.66
     tonight        -0.64
    greg            -0.64
    çīĪ             -0.63
    chell           -0.61
    POSITIVE LOGITS
     underdog       0.97
     unve           0.87
     oneself        0.85
     unexpected     0.85
     stumble        0.85
     headline       0.78
     hastily        0.77
     triumph        0.76
     unexpectedly   0.75
     suddenly       0.73
    Activation Density: 1.222%

    No Known Activations