INDEX
    Explanations

    quantifiers and numbers used for comparison

    Negative Logits
    oog       -0.15
    otu       -0.14
    uito      -0.14
    osit      -0.14
    \Carbon   -0.13
    inium     -0.13
    infeld    -0.13
    bens      -0.13
    untas     -0.13
    atas      -0.13
    Positive Logits
    aml       0.15
     Parkway  0.14
    orum      0.14
     Petr     0.14
     Pamela   0.14
    ente      0.14
    ter       0.14
    pad       0.13
    elden     0.13
    idon      0.13
    Activation Density 0.219%

    No Known Activations