INDEX
    Explanations

    phrases indicating decisions or choices to be made

    Negative Logits
    Token             Logit
    "oil"             -0.78
    "atha"            -0.77
    "acements"        -0.72
    " tremend"        -0.70
    "athi"            -0.68
    "atched"          -0.68
    "ikan"            -0.68
    "oing"            -0.66
    "ffff"            -0.65
    "outh"            -0.64
    Positive Logits
    Token             Logit
    " whether"         1.00
    " decide"          0.75
    " decisions"       0.74
    " deciding"        0.72
    " unanimously"     0.72
    " how"             0.70
    " differently"     0.69
    " decisively"      0.69
    " decides"         0.69
    " upon"            0.69
    Act Density 0.033%

    No Known Activations
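
The page does not define the activation density figure reported above. One plausible reading is the fraction of sampled token positions on which the feature activates at all. The sketch below illustrates that reading only. The function name `activation_density`, the `threshold` parameter, and the sample data are hypothetical and do not come from the tool that produced this page.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of positions whose activation exceeds `threshold`.

    Assumption: "density" means the share of token positions where the
    feature fires; the source page does not confirm this definition.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical data: the feature fires on 3 of 10,000 token positions.
sample = [0.0] * 9997 + [0.4, 1.2, 0.05]
density = activation_density(sample)
print(f"Act Density {density:.3%}")
```

Under this reading, a density of 0.033% would mean the feature fires on roughly one token position in three thousand, which fits a sparse feature tied to a narrow set of decision-related words.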