    Explanations

    phrases related to comparison or addition

    repeated phrases and expressions indicating similarity or comparison

    Negative Logits
    Token             Logit
    " Versus"         -0.73
    " newsp"          -0.65
    "bp"              -0.64
    " '["             -0.61
    "Hy"              -0.61
    " Maintenance"    -0.61
    " thous"          -0.61
    " gorilla"        -0.60
    "rarily"          -0.60
    " Mansion"        -0.59

    Positive Logits
    Token             Logit
    " sidx"           0.85
    "vous"            0.75
    "otton"           0.74
    " includ"         0.72
    "aturated"        0.71
    "chev"            0.70
    "liga"            0.70
    "este"            0.70
    " besides"        0.69
    "amples"          0.68

    Activation Density: 0.079%

    No Known Activations