INDEX
    Explanations

    phrases related to progress or development over time

    phrases indicating the status or condition of various situations

    Negative Logits
    Token            Logit
    "lication"       -0.79
    "lement"         -0.79
    " Cosponsors"    -0.74
    "lees"           -0.73
    "onga"           -0.71
    "pora"           -0.70
    "hardt"          -0.67
    "obook"          -0.67
    "ordes"          -0.67
    "roll"           -0.66

    Positive Logits
    Token            Logit
    " happening"     0.89
    " happ"          0.86
    " happen"        0.79
    " cov"           0.76
    " transpired"    0.76
    " downhill"      0.73
    " wrong"         0.68
    " unfolded"      0.68
    " undone"        0.68
    " spir"          0.66
    Activation Density: 0.222%

    No Known Activations