INDEX
    Explanations

    phrases related to decision-making and personal experiences

    Negative Logits
    Token             Logit
    "currently"       -0.78
    " presently"      -0.77
    "Dialogue"        -0.70
    " now"            -0.69
    " anymore"        -0.68
    "now"             -0.67
    " currently"      -0.66
    "ethy"            -0.65
    "arta"            -0.65
    "ulum"            -0.64
    Positive Logits
    Token             Logit
    " originally"     1.18
    " yesterday"      1.05
    " last"           1.03
    " earlier"        1.02
    " previously"     0.98
    " initially"      0.92
    "hes"             0.91
    " recently"       0.87
    "wolves"          0.87
    " unsuccessful"   0.82
    Activation Density: 2.929%

    No Known Activations