    Explanations

    phrases related to evaluation or comparison

    phrases indicating subjective opinions or beliefs

    Negative Logits
    Token            Logit
    "Reviewer"       -0.90
    " Beautiful"     -0.64
    " cz"            -0.63
    " superb"        -0.62
    " unim"          -0.61
    " undefeated"    -0.60
    " wonder"        -0.60
    " irresistible"  -0.59
    "ellen"          -0.59
    "uries"          -0.59
    Positive Logits
    Token            Logit
    " referring"     1.23
    " refers"        1.03
    " referencing"   1.03
    " meant"         1.03
    " merely"        0.97
    " signify"       0.91
    " mean"          0.87
    " implying"      0.85
    "just"           0.83
    " refer"         0.83
    Activation Density: 0.819%

    No Known Activations