INDEX
    Explanations
    Negative Logits
     decor          -0.68
     Painter        -0.68
     architecture   -0.63
     adm            -0.59
     shadow         -0.58
     exact          -0.57
     fire           -0.57
     ages           -0.57
     balances       -0.57
     fires          -0.57
    Positive Logits
    pmwiki          0.93
    1               0.88
    2               0.84
    embed           0.83
    gp              0.79
    TPPStreamerBot  0.78
    english         0.78
    kr              0.77
    bleacher        0.77
    vc              0.76
    Act Density 0.012%

    No Known Activations