INDEX
    Explanations

    reassuring statements to alleviate worries or fears

    Negative Logits
    Token               Logit
    "artney"            -0.76
    "avor"              -0.73
    "ourses"            -0.66
    " progressively"    -0.65
    "iband"             -0.63
    " decom"            -0.60
    "urate"             -0.60
    "avorite"           -0.60
    "arching"           -0.60
    "olid"              -0.59

    Positive Logits
    Token               Logit
    "!"                  1.00
    "!:"                 0.98
    "!]"                 0.90
    " ladies"            0.88
    "!),"                0.87
    "!)."                0.87
    "!)"                 0.83
    "!,"                 0.81
    "!!"                 0.80
    " folks"             0.80
    Activation Density: 0.110%

    No Known Activations