INDEX
    Explanations

    negative constructions and prohibitions in language

    Negative Logits
     SPDX    -0.16
    ko    -0.15
     stru    -0.14
    vrd    -0.14
    CAD    -0.14
    zig    -0.14
    -Za    -0.14
    enever    -0.14
    Enumerator    -0.13
     Vill    -0.13
    Positive Logits
    uchi    0.17
    oran    0.17
     proof    0.15
    vail    0.14
    åī¯    0.14
    abler    0.14
     partial    0.14
     dint    0.14
    aku    0.14
    ange    0.14
    Act Density 0.003%

    No Known Activations