    Negative Logits
    Logit   Token
    -0.08   Root
    -0.07   """
    -0.07   .Power
    -0.07   power
    -0.07    Heavy
    -0.07   Power
    -0.07   _PREVIEW
    -0.07    separating
    -0.06   entity
    -0.06    ==========

    Positive Logits
    Logit   Token
    0.11     labs
    0.11     Labs
    0.10     Lab
    0.09    Lab
    0.09    lab
    0.09     LAB
    0.08     lab
    0.07    ubs
    0.07    LAB
    0.07     champs

    Activation Density: 0.010%

    No Known Activations