INDEX
    Explanations

    references to challenges and resources in various contexts

    Negative Logits
    Token         Logit
    "flen"        -0.14
    "elig"        -0.14
    ".allow"      -0.14
    "ende"        -0.13
    " Fucked"     -0.13
    " lending"    -0.13
    " ваг"        -0.13
    "unami"       -0.13
    "eron"        -0.13
    "ï"           -0.13
    Positive Logits
    Token           Logit
    " consume"      0.47
    " consumes"     0.41
    " Consum"       0.41
    " consuming"    0.40
    "consume"       0.39
    " eats"         0.38
    " drain"        0.36
    " eat"          0.35
    " Drain"        0.33
    " devour"       0.33
    Activation Density: 0.263%

    No Known Activations