    NEGATIVE LOGITS
    "It           -0.07
     \""          -0.07
    -it           -0.07
    icolor        -0.06
     Beer         -0.06
    Kit           -0.06
    _Set          -0.06
    jit           -0.06
    lier          -0.06
    Plot          -0.06
    POSITIVE LOGITS
     Trans        0.15
    Trans         0.15
     trans        0.14
    	trans        0.11
    .trans        0.11
    .Trans        0.10
    trans         0.10
    -trans        0.10
    _trans        0.10
     transgender  0.10
    Activation Density: 0.021%

    No Known Activations