INDEX
    Explanations

    phrases indicating a comparison or contrast

    phrases beginning with "by," indicating attribution or causation

    Negative Logits
    ounter          -0.65
    ptions          -0.64
     digs           -0.63
    istan           -0.62
     ILCS           -0.61
    earable         -0.59
    ptive           -0.58
    imal            -0.58
    stadt           -0.54
    ylum            -0.53
    Positive Logits
    products        1.28
     virtue         1.16
    akuya           1.07
    laws            1.03
    product         1.02
     implication    1.01
    catch           0.96
    gone            0.96
     extension      0.91
     default        0.86
    Activation Density: 0.083%

    No Known Activations