INDEX
    Explanations

    phrases indicating small quantities or degrees

    Negative Logits
    Token          Logit
    "ed"           -0.19
    "ftware"       -0.16
    " somewhat"    -0.16
    "edo"          -0.15
    "eded"         -0.15
    "δί"           -0.15
    " slightly"    -0.14
    "nt"           -0.14
    " intended"    -0.14
    "hi"           -0.14

    Positive Logits
    Token          Logit
    "/stdc"         0.28
    "umen"          0.27
    ".ly"           0.25
    "mapped"        0.21
    "Torrent"       0.20
    "rary"          0.20
    "ingly"         0.20
    "umin"          0.20
    " more"         0.20
    "tern"          0.18
    Activation Density 0.018%

    No Known Activations