INDEX
    Explanations
    NEGATIVE LOGITS
    Token     Logit
    "7"       -0.09
    "13"      -0.09
    "11"      -0.09
    "9"       -0.09
    "6"       -0.09
    "10"      -0.08
    "18"      -0.08
    "12"      -0.08
    "5"       -0.08
              -0.07
    POSITIVE LOGITS
    Token          Logit
    " superv"      0.07
    "viously"      0.07
    "hol"          0.06
    "bos"          0.06
    " Reflect"     0.06
    "lif"          0.06
    " Üst"         0.06
    "hlas"         0.06
    "unsqueeze"    0.06
    "Atual"        0.06
    Act Density 0.556%

    No Known Activations