INDEX
    Explanations

    numerical values and references that might denote statistical or academic data

    Negative Logits
    etta
    -0.17
    pais
    -0.15
    信
    -0.14
     ADV
    -0.14
     PEN
    -0.14
    iom
    -0.14
    opers
    -0.14
    iland
    -0.14
     Dame
    -0.14
    esting
    -0.14
    Positive Logits
     Burl
    0.16
    &C
    0.15
    ilogy
    0.15
     filetype
    0.15
     Rae
    0.15
    _softmax
    0.14
     Hastings
    0.14
    Ńå·ŀ
    0.14
    jack
    0.14
    kaar
    0.14
    Act Density 0.018%

    No Known Activations