INDEX
    Explanations

    numeric values or quantifiers

    numbers, percentages, code

    Negative Logits
    @@@@@@@@    -0.51
    principalTable    -0.47
     ……………………    -0.47
    >>>>>>>>    -0.47
     …………    -0.46
    ###############    -0.44
    ****************    -0.44
    ………………………………    -0.43
    ……………………    -0.43
    ////////////////    -0.42
    Positive Logits
    <bos>    0.90
     digitais    0.67
    pulseira    0.65
     мәкал    0.64
     engraçadas    0.63
     mesmas    0.63
     ancaman    0.62
     traseira    0.62
    sapato    0.61
     dianteira    0.60
    Activation Density: 0.292%

    No Known Activations