    Explanations

    instead, without, unduly

    Negative Logits
     manej          0.33
     fyra           0.32
     charakter      0.31
     toutefois      0.31
     تے             0.31
     ਅਤੇ            0.30
     nummer         0.30
    いくつかの      0.30
     sowohl         0.30
     außerdem       0.30
    Positive Logits
     вместо         0.51
     unduly         0.46
     или            0.45
     χωρίς          0.45
    而不            0.44
     unnecessarily  0.44
     instead        0.44
     without        0.42
     unfairly       0.41
    0.40
    Act Density 0.121%

    No Known Activations