INDEX
    Explanations

    adjectival phrases that describe concepts or characteristics in specific contexts

    Negative Logits
    est           -0.18
    123           -0.17
    518           -0.16
    847           -0.15
    INGER         -0.15
    lo            -0.15
    inger         -0.15
    stime         -0.14
    cks           -0.14
    ably          -0.14
    Positive Logits
    slaught       0.16
    /math         0.16
    ếu            0.16
    aday          0.15
    assin         0.15
    riel          0.15
    /ge           0.15
    agged         0.14
     NTN          0.14
    partment      0.14
    Act Density 0.067%

    No Known Activations