    Explanations

    phrases indicating a clear statement or declaration

    phrases emphasizing clarity or making something explicit

    Negative Logits
    Token          Logit
    "ickle"        -0.65
    "Luck"         -0.65
    " luck"        -0.64
    "gins"         -0.62
    " Derby"       -0.62
    "ools"         -0.61
    "apter"        -0.59
    " Variant"     -0.59
    "lymp"         -0.59
    "otin"         -0.58
    Positive Logits
    Token               Logit
    " explicitly"       0.84
    " unamb"            0.83
    " upfront"          0.83
    " unequivocally"    0.82
    " emphatically"     0.81
    "ances"             0.78
    " commitments"      0.77
    " unequiv"          0.77
    " plainly"          0.75
    " distinction"      0.74
    Activation Density: 0.137%

    No Known Activations