INDEX
    Explanations

    conditional statements or hypothetical scenarios using the word "if"

    Negative Logits
    "pour"     -0.86
    "ahime"    -0.81
    "oult"     -0.81
    "olis"     -0.76
    "ggles"    -0.75
    "ossom"    -0.73
    "berus"    -0.73
    "uct"      -0.73
    "iband"    -0.69
    "ricks"    -0.69
    Positive Logits
    " they"               0.92
    " unwittingly"        0.79
    " unintentionally"    0.78
    " outnumbered"        0.77
    " THEY"               0.77
    " it"                 0.77
    " warranted"          0.76
    "soever"              0.75
    " inadvertently"      0.74
    " technically"        0.74
    Act Density 0.081%

    No Known Activations