INDEX
    Explanations
    Negative Logits
    Token     Logit
    to        -0.78
    from      -0.75
    into      -0.74
    his       -0.72
    we        -0.72
    nor       -0.71
    beyond    -0.71
    with      -0.71
    now       -0.70
    on        -0.69
    Positive Logits
    Token          Logit
    varandra       1.01
    stället        0.97
    lèvres         0.91
    médicaments    0.90
    vaisseaux      0.90
    skydd          0.88
    courants       0.88
    äldre          0.87
    säll           0.85
    sociala        0.83
    Activation Density: 0.239%

    No Known Activations