INDEX
    Explanations
    Negative Logits
    :Set
    -0.08
    =sub
    -0.07
     PLAY
    -0.07
     screened
    -0.06
     BP
    -0.06
     phy
    -0.06
    来进行
    -0.06
     Williams
    -0.06
     นอกจาก
    -0.06
     Sets
    -0.06
    Positive Logits
    ęb
    0.07
    ι
    0.07
    ave
    0.07
    Ar
    0.07
    //----------------------------------------------------------------
    0.07
     דרכים
    0.06
    amiento
    0.06
    老太太
    0.06
    _branch
    0.06
    #----------------------------------------------------------------
    0.06
    Activation Density: 0.002%

    No Known Activations