    Negative Logits
    -0.07  "second"
    -0.07  " datab"
    -0.07  "_diag"
    -0.07  " Doub"
    -0.07  " Dispatcher"
    -0.07  " Do"
    -0.07  "________________________________________________________________"
    -0.07  " Dahl"
    -0.07  " Ri"
    -0.06  "новаж"
    Positive Logits
    0.07  " October"
    0.07  " Oct"
    0.07  "itorio"
    0.07  " '%'"
    0.06  "Attend"
    0.06  " oats"
    0.06  "攻撃"
    0.06  " metabol"
    0.06  "Oct"
    0.06  "AT"
    Activation Density: 0.020%

    No Known Activations