INDEX
    Explanations

    phrases indicating a change or development in a positive direction

    Negative Logits
    Token             Logit
    "Joined"          -0.74
    "odied"           -0.74
    " successors"     -0.69
    "otype"           -0.68
    " predecessors"   -0.68
    "DNA"             -0.67
    "Introduced"      -0.65
    "issance"         -0.65
    "inia"            -0.64
    "lication"        -0.62
    Positive Logits
    Token             Logit
    " downhill"       0.90
    " spir"           0.79
    " BELOW"          0.77
    " blurry"         0.73
    " unfolded"       0.73
    " murky"          0.72
    " bleak"          0.70
    " calmed"         0.70
    " fluid"          0.70
    " Thrones"        0.70
    Activation Density: 1.657%

    No Known Activations