    Explanations

    instances of decisions being made

    instances of the word "decided" in various contexts

    Negative Logits
    Token           Logit
    "eries"         -0.64
    "anon"          -0.60
    "ILA"           -0.59
    "iddle"         -0.57
    "reen"          -0.56
    "oing"          -0.55
    "quin"          -0.55
    "Lens"          -0.55
    "Newsletter"    -0.55
    "ighth"         -0.53
    Positive Logits
    Token              Logit
    " to"              1.04
    " unanimously"     0.86
    " against"         0.85
    " upon"            0.82
    " that"            0.75
    " unilaterally"    0.72
    " beforehand"      0.69
    " not"             0.69
    " differently"     0.69
    " nevertheless"    0.68
    Activation Density: 0.068%

    No Known Activations