    Explanations

    phrases that indicate a rephrasing or simplification of information

    Negative Logits
    Token           Logit
    "straints"      -0.17
    "erable"        -0.15
    "zan"           -0.15
    "مح"            -0.15
    "adius"         -0.14
    "straint"       -0.14
    "tring"         -0.14
    "ogue"          -0.14
    " Telescope"    -0.14
    "anou"          -0.14
    Positive Logits
    Token           Logit
    "xi"            0.15
    "▍"             0.15
    " ph"           0.15
    "arb"           0.15
    "appers"        0.14
    "isco"          0.14
    "люд"           0.14
    "MDB"           0.14
    "phrase"        0.14
    "ropa"          0.14
    Activation Density: 0.167%

    No Known Activations