INDEX
    Explanations

    phrases that indicate simplicity or ease of use

    Negative Logits
    Token         Logit
    "_sink"       -0.16
    "pire"        -0.15
    "Unchecked"   -0.15
    "mers"        -0.15
    "iate"        -0.14
    "品"          -0.14
    "anse"        -0.14
    " Pins"       -0.14
    "ogeneous"    -0.14
    "mys"         -0.14
    Positive Logits
    Token         Logit
    " dàng"       0.24
    "ened"        0.21
    "ening"       0.20
    "-to"         0.18
    "going"       0.17
    "/simple"     0.17
    "xes"         0.16
    "oload"       0.16
    "/latest"     0.16
    "(er"         0.15
    Act Density 0.039%

    No Known Activations