INDEX
    Explanations
    Negative Logits
    Token           Logit
    "united"        -1.23
    " united"       -1.21
    "United"        -1.18
    " UNITED"       -1.08
    " United"       -0.95
    "UNITED"        -0.89
    " ujednoznacz"  -0.84
    " oxygen"       -0.81
    " advanced"     -0.81
    "oxygen"        -0.77
    Positive Logits
    Token              Logit
    " States"          0.72
    "GOTREF"           0.64
    "SequentialGroup"  0.63
    "kheim"            0.56
    "riwal"            0.55
    "itects"           0.53
    "')));"            0.53
    "InputTagHelper"   0.52
    "ristmas"          0.52
    " STATES"          0.51
    Activation Density: 1.583%

    No Known Activations