INDEX
    Explanations

    references to significant historical figures and their contributions to science

    Negative Logits
    Token            Logit
    "iyorum"         -0.15
    "ità"            -0.15
    "avor"           -0.14
    "-urlencoded"    -0.14
    "ocha"           -0.14
    "ávka"           -0.14
    "odelist"        -0.14
    "llib"           -0.14
    "phinx"          -0.13
    "itori"          -0.13
    Positive Logits
    Token        Logit
    " would"     0.86
    "would"      0.76
    " Would"     0.74
    "Would"      0.70
    " wouldn"    0.59
    " skulle"    0.53
    " zou"       0.51
    " serait"    0.49
    " würde"     0.49
    " Wouldn"    0.47
    Activation Density: 0.389%

    No Known Activations