    Negative Logits
    Token            Logit
    " varargin"      -0.07
    " visions"       -0.07
    "зна"            -0.06
    " gated"         -0.06
    "IVEN"           -0.06
    "ίζ"             -0.06
    (blank)          -0.06
    "ght"            -0.06
    "UNKNOWN"        -0.06
    "CURRENT"        -0.06

    Positive Logits
    Token            Logit
    "uppy"            0.07
    " Atom"           0.06
    ".grp"            0.06
    " should"         0.06
    "xE"              0.06
    (blank)           0.06
    " Frost"          0.06
    " korun"          0.06
    ".over"           0.06
    "_components"     0.06

    Activation Density: 0.032%

    No Known Activations