    Negative Logits
    "bourne"       -0.07
    " wors"        -0.07
    " strategic"   -0.07
    "WEST"         -0.07
    " Ri"          -0.07
    "_timing"      -0.06
    " Nim"         -0.06
    " Timothy"     -0.06
    "sch"          -0.06
    "σφ"           -0.06
    Positive Logits
    " 좋은"                    0.07
    ".Maximum"                0.06
    [token not shown]         0.06
    "alta"                    0.06
    "Eliminar"                0.06
    "orno"                    0.05
    "********************"    0.05
    "何か"                    0.05
    " Québec"                 0.05
    " fourn"                  0.05
    Activation Density: 0.049%

    No Known Activations