INDEX
    Explanations

    phrases related to facing challenges or negative outcomes

    phrases indicating significant actions or changes

    Negative Logits
    ndra       -0.76
    orth       -0.75
    bably      -0.71
    osity      -0.70
    -+-+       -0.70
    ulty       -0.70
    arth       -0.67
    anni       -0.67
    legates    -0.66
    apo        -0.66
    Positive Logits
     seriously    0.98
     lightly      0.87
     aback        0.80
     virginity    0.80
     hostage      0.78
     Seriously    0.77
     stride       0.77
     plunge       0.76
     cue          0.76
     reins        0.75
    Activation Density: 0.255%

    No Known Activations