INDEX
    Explanations

    names or labels

    phrases that indicate listing or naming items or examples

    Negative Logits
    Token            Logit
    "ysc"            -0.69
    "entimes"        -0.68
    "loo"            -0.67
    "issance"        -0.67
    " childbirth"    -0.63
    " deterior"      -0.63
    "tail"           -0.60
    " propelled"     -0.60
    "Returns"        -0.60
    "depth"          -0.60
    Positive Logits
    Token             Logit
    " culprit"        0.86
    " names"          0.85
    " specific"       0.81
    " perpetrators"   0.76
    " NCT"            0.74
    " blame"          0.74
    " Names"          0.73
    "GROUP"           0.72
    " perpetrator"    0.71
    " particular"     0.71
    Activation Density: 0.317%

    No Known Activations