INDEX
    Negative Logits
    Token            Logit
    "tested"         -0.07
    " Datum"         -0.06
    " conceptual"    -0.06
    " ger"           -0.06
    " DF"            -0.06
    "Fan"            -0.06
    " Regression"    -0.06
    " avalanche"     -0.06
    " Licence"       -0.06
    " anthem"        -0.06
    Positive Logits
    Token            Logit
    " energia"       0.15
    "nergie"         0.10
    "=========↵"     0.07
    " Energ"         0.07
    "_receive"       0.06
    "OTT"            0.06
    " енерг"         0.06
    " هست"           0.06
    " energie"       0.06
    "acje"           0.06
    Activation Density: 0.005%

    No Known Activations