INDEX
    Explanations

    phrases or sentences comparing different entities or concepts

    phrases emphasizing comparative structures or similarities

    Negative Logits
    Token          Logit
    " umb"         -0.75
    " balcon"      -0.68
    " escal"       -0.66
    " unequ"       -0.64
    " bystand"     -0.64
    " flares"      -0.63
    " extraord"    -0.62
    "emer"         -0.62
    "ama"          -0.61
    "omet"         -0.60

    Positive Logits
    Token          Logit
    "sex"           0.72
    "ricanes"       0.70
    "ounter"        0.67
    "dragon"        0.67
    "roman"         0.66
    "riers"         0.66
    "aign"          0.64
    "aneously"      0.64
    " Rico"         0.64
    "same"          0.64
    Activation Density: 0.045%

    No Known Activations