INDEX
    Explanations

    references to projects and their progress

    Negative Logits
     darn       -0.20
     folks      -0.17
    ">&#        -0.17
    ebi         -0.16
    quip        -0.15
     folk       -0.15
    éĻ          -0.15
    folk        -0.15
    AIT         -0.15
     ÑĤÑĢен    -0.14
    Positive Logits
     fuck       0.25
     âĢŀ        0.25
    fuck        0.22
     fucked     0.21
                0.20
     fucks      0.20
     FUCK       0.19
     fucking    0.19
     kind       0.19
     cunt       0.19
    Act Density 0.007%

    No Known Activations