    Negative Logits
    EMPLARY         -0.07
    보고            -0.07
     utterly        -0.06
     Anniversary    -0.06
     Recipe         -0.06
     Floor          -0.06
     consulta       -0.06
    TABLE           -0.06
    าภ              -0.06
     Ella           -0.06
    Positive Logits
     ignite         0.07
    ξης             0.07
    inactive        0.07
     ket            0.07
    -operative      0.07
    βι              0.07
                    0.07
    Forward         0.06
    rix             0.06
    ープ            0.06
    Activation Density: 0.013%

    No Known Activations