INDEX
    Explanations

    terms or phrases related to comparison or evaluation

    phrases that emphasize comparisons or evaluations in terms of measurable criteria

    Negative Logits
    Token                 Logit
    "resent"              -0.77
    "oaded"               -0.76
    "estern"              -0.76
    "dinand"              -0.72
    "****************"    -0.70
    " tatt"               -0.69
    "avorite"             -0.67
    "enegger"             -0.65
    "oster"               -0.64
    "oute"                -0.64

    Positive Logits
    Token                 Logit
    "pace"                0.81
    "ames"                0.81
    "pring"               0.79
    " terms"              0.78
    "uman"                0.77
    "cale"                0.76
    "cape"                0.76
    "terms"               0.73
    " parity"             0.71
    "forth"               0.70

    Activation Density: 0.025%

    No Known Activations