    Explanations

    phrases indicating a choice or possibility

    phrases indicating conditionality and potentiality

    Negative Logits
    Token               Logit
    "ratulations"       -0.70
    "itiz"              -0.70
    " congratulations"  -0.63
    "athing"            -0.61
    "ãĥĥãĥĪ"            -0.59
    "orer"              -0.59
    "phabet"            -0.58
    "eur"               -0.58
    " Gone"             -0.56
    "itary"             -0.56
    Positive Logits
    Token               Logit
    " expense"          0.99
    " moment"           0.97
    " glance"           0.90
    " rate"             0.86
    "mom"               0.83
    " behest"           0.82
    " point"            0.82
    " cost"             0.82
    "point"             0.81
    "cost"              0.80
    Activation Density: 0.033%

    No Known Activations