    Negative Logits
    [token missing]    -0.08
    "inng"             -0.08
    " Moy"             -0.08
    "onga"             -0.08
    " Wohnzimmer"      -0.08
    " Lorraine"        -0.08
    " lumi"            -0.08
    "Superior"         -0.07
    "innitus"          -0.07
    "Floor"            -0.07
    Positive Logits
    " pair"            0.08
    " numbered"        0.08
    " sorted"          0.08
    " wise"            0.08
    [token missing]    0.07
    " integers"        0.07
    " mostly"          0.07
    "ील"               0.07
    " forage"          0.07
    " akik"            0.07
    Activation Density: 0.062%

    No Known Activations