INDEX
    Explanations

    URLs and links to online resources, particularly from GitHub and Twitter

    Negative Logits
    ĪĴ            -0.80
    ulhu          -0.73
    ornia         -0.73
    onics         -0.66
    ERO           -0.64
    Medic         -0.63
     Reincarn     -0.62
     proportions  -0.62
     mug          -0.62
     Siren        -0.62
    Positive Logits
    groups        0.79
    pages         0.71
     buttons      0.70
    },"           0.69
    username      0.68
    theless       0.68
    foo           0.66
    /             0.66
    ":["          0.65
     Michaels     0.65
    Activation Density: 0.052%

    No Known Activations