INDEX
    Explanations

    phrases indicating opinions or evaluations

    quotation marks and related discourse markers

    Negative Logits
     scares          -0.76
     Flavoring       -0.72
     guiActiveUn     -0.69
     summarizes      -0.67
     NET             -0.64
     veil            -0.64
    çīĪ              -0.63
     formations      -0.63
     accompanies     -0.62
     loopholes       -0.62
    Positive Logits
    absolutely       1.04
    done             1.00
    really           0.99
    ready            0.98
    likely           0.97
    completely       0.96
    extremely        0.90
    appropriately    0.89
    fed              0.89
    still            0.87
    Activation Density: 0.221%

    No Known Activations