INDEX
    Explanations

    punctuation marks that indicate enumeration or lists

    Negative Logits
    Token          Logit
    "arie"         -0.61
    "ropolitan"    -0.61
    "yon"          -0.61
    "uber"         -0.59
    " Moines"      -0.57
    "gow"          -0.57
    "paren"        -0.56
    "orget"        -0.55
    "vae"          -0.54
    "irs"          -0.54
    Positive Logits
    Token             Logit
    " prompting"      1.18
    " hence"          1.09
    " thus"           1.04
    " resulting"      1.03
    " respectively"   1.02
    " although"       1.02
    " albeit"         1.01
    " implying"       1.00
    " namely"         0.99
    " whereas"        0.99
    Activation Density: 0.274%

    No Known Activations