    Explanations

    phrases related to weighing advantages and disadvantages

    references to the advantages of a situation or concept, often described as "pros."

    Negative Logits
    Token       Logit
    "ading"     -0.70
    "DER"       -0.69
    "ATA"       -0.68
    "MORE"      -0.68
    "aded"      -0.66
    "ashes"     -0.66
    "OWS"       -0.63
    "owship"    -0.63
    "orf"       -0.62
    "raped"     -0.62
    Positive Logits
    Token       Logit
    " pros"     1.25
    "ocial"     1.10
    " Pros"     1.00
    " outwe"    0.91
    "aic"       0.90
    "yip"       0.87
    "Pros"      0.86
    " pse"      0.86
    "cephal"    0.84
    "daq"       0.83
    Activation Density: 0.009%

    No Known Activations