    NEGATIVE LOGITS
    Token               Logit
    "观念"              -0.75
    "ininger"           -0.74
    " kän"              -0.73
    (token not shown)   -0.71
    " imprimé"          -0.70
    " richt"            -0.70
    "strap"             -0.69
    "peria"             -0.69
    " Einsatz"          -0.68
    " doré"             -0.68
    POSITIVE LOGITS
    Token               Logit
    " move"             2.89
    "move"              2.73
    "Move"              2.38
    " Move"             2.22
    "MOVE"              2.16
    " moved"            2.00
    " moving"           1.88
    " MOVE"             1.88
    "ToMove"            1.79
    " moves"            1.77
    Activation Density: 0.022%

    No Known Activations