INDEX
    Explanations

    Syntactic constructs and function definitions in code, particularly method declarations

    Negative Logits
    wick        -0.17
    bbc         -0.17
    ainter      -0.16
    stanov      -0.15
    aticon      -0.14
     Bucc       -0.14
    atatype     -0.14
    anvas       -0.14
    ottom       -0.14
    anded       -0.14
    Positive Logits
    idon        0.16
    IZER        0.16
    aler        0.14
    zelf        0.14
     Col        0.14
    uzzi        0.14
     heads      0.13
    col         0.13
    adh         0.13
    .setter     0.13
    Activation Density: 0.003%

    No Known Activations