    Negative Logits
    Token                 Logit
    ":✨"                  -0.43
    (whitespace)          -0.43
    "Personensuche"       -0.43
    "<h3>"                -0.42
    " defStyle"           -0.41
    "play"                -0.41
    "<eos>"               -0.40
    " شاد"                -0.39
    "OrEmpty"             -0.38
    "!"                   -0.38
    Positive Logits
    Token                 Logit
    "According"           1.14
    " According"          1.05
    "according"           0.68
    " חיצוניים"           0.64
    " according"          0.63
    "Secondo"             0.63
    " שוליים"             0.60
    "Según"               0.58
    "addPreferredGap"     0.58
    " Secondo"            0.57
    Activation density: 0.011%

    No Known Activations