INDEX
    Explanations
    NEGATIVE LOGITS
    Token           Logit
    " discovery"    -0.08
    " falsch"       -0.07
    " cessation"    -0.07
    " wrongly"      -0.07
    "λί"            -0.07
    "iles"          -0.07
    " Discovery"    -0.07
                    -0.07
    " lifecycle"    -0.07
    " nachhaltig"   -0.07
    POSITIVE LOGITS
    Token           Logit
    " hoping"        0.10
    "期待"            0.09
    " positioned"    0.09
    "hopefully"      0.08
    " berharap"      0.08
    " seasoned"      0.08
    " daunting"      0.08
    " Pereira"       0.08
    " tackling"      0.08
    " poised"        0.08
    Activation Density 0.138%

    No Known Activations