INDEX
    Explanations

    numerical values and their context within texts

    Negative Logits
    -blind      -0.15
    uru         -0.15
     Pulse      -0.15
    786         -0.14
    pdo         -0.14
     mouth      -0.13
    ÙĪÙĨا      -0.13
     cÃł        -0.13
    nic         -0.13
    aber        -0.13

    Positive Logits
    maz         0.15
    esto        0.14
    ĶĦ          0.14
    tered       0.14
    @student    0.14
    ajs         0.13
    dÄĽ         0.13
    vable       0.13
    åĸľ         0.13
    yle         0.13
    Act Density 0.051%

    No Known Activations