INDEX
    Explanations
    Negative Logits
        Token        Logit
        " with"      -1.40
        " within"    -1.17
        " from"      -1.16
        " for"       -1.01
        " whose"     -0.99
        " to"        -0.97
        " that"      -0.96
        " by"        -0.96
        "With"       -0.93
        " into"      -0.87
    Positive Logits
        Token            Logit
        " décider"       0.95
        " contribuer"    0.86
        " hivyo"         0.85
        " couverts"      0.84
        " poca"          0.82
        " résister"      0.81
        "LITERAL"        0.81
        " wis"           0.81
        " distrik"       0.81
        "ablemente"      0.80
    Activation Density: 0.006%

    No Known Activations