INDEX
    Explanations
    NEGATIVE LOGITS
    <Node       -0.08
    riz         -0.08
    .transpose  -0.07
    冲刺        -0.07
    thalm       -0.07
     Koch       -0.07
     Turn       -0.07
    _Name       -0.07
     IMAGE      -0.07
    inth        -0.07
    POSITIVE LOGITS
     abb        0.07
     folks      0.07
     hob        0.06
    Gil         0.06
     ministers  0.06
     spęd       0.06
    Intialized  0.06
                0.06
     ladies     0.06
    _blocking   0.06
    Act Density 0.007%

    No Known Activations