INDEX
    Explanations

    difficult decisions

    Negative Logits
    "vae"         -0.82
    "english"     -0.75
    "bley"        -0.72
    "icas"        -0.72
    "icone"       -0.71
    "zona"        -0.71
    "ogene"       -0.71
    "eco"         -0.70
    "amen"        -0.70
    "licensed"    -0.70
    Positive Logits
    " decisions"   1.07
    " makers"      1.07
    " maker"       0.94
    " ACTIONS"     0.93
    " decision"    0.91
    " choices"     0.80
    " regarding"   0.77
    " calculus"    0.76
    " whether"     0.76
    " wisely"      0.75
    Activation Density: 12.867%

    No Known Activations