INDEX
    Explanations

    words that quantify an approximation or degree of existence

    Negative Logits
    ABS       -0.15
    esan      -0.14
    iye       -0.14
    vas       -0.14
    bsd       -0.13
     somew    -0.13
    scient    -0.13
    irsch     -0.13
    /REC      -0.13
    974       -0.13
    Positive Logits
    abaj      0.17
     sums     0.15
    avou      0.14
    STACK     0.14
    κι        0.14
    itious    0.13
    umbo      0.13
     Cous     0.13
    -fed      0.13
    ahir      0.13
    Act Density 0.018%

    No Known Activations