INDEX
    Explanations

    references to boundaries or enclosures, particularly in the context of fences and gates

    Negative Logits
     baiser  -0.18
    resco  -0.16
    rrha  -0.15
    Ế  -0.15
     æ©  -0.15
    âķĿ  -0.15
    -lnd  -0.15
    صد  -0.14
     zbyt  -0.14
    zh  -0.14
    Positive Logits
     net  0.17
     collo  0.15
     mur  0.15
    Æ°á»Ľng  0.15
     Dos  0.15
    ature  0.15
    560  0.15
    -door  0.14
    Dos  0.14
    gate  0.14
    Activation Density: 0.036%

    No Known Activations