INDEX
    Explanations
    NEGATIVE LOGITS
    Token               Logit
    (no token text)     -0.08
    " Faul"             -0.07
    "โฆษ"               -0.07
    (no token text)     -0.07
    "퀀"                -0.06
    " טבע"              -0.06
    (no token text)     -0.06
    "ten"               -0.06
    " yuk"              -0.06
    "więks"             -0.06
    POSITIVE LOGITS
    Token               Logit
    " independently"    0.08
    "Interrupt"         0.07
    " Programs"         0.07
    "progress"          0.07
    " >&"               0.07
    "_scheme"           0.07
    "needs"             0.07
    " Enforcement"      0.07
    "bourne"            0.07
    " proposals"        0.06
    Activation Density: 0.001%

    No Known Activations