INDEX
    Explanations

    proper nouns, particularly names of individuals and organizations

    Negative Logits
    Token        Logit
    "bsite"      -0.16
    "ÄĻ"         -0.15
    "ipur"       -0.15
    "phinx"      -0.15
    "ustum"      -0.14
    "érc"        -0.14
    "ixin"       -0.14
    "ffen"       -0.14
    "prompt"     -0.14
    " simply"    -0.14
    Positive Logits
    Token        Logit
    "son"        0.14
    "éĽĦ"        0.14
    "erts"       0.13
    "ãĥ³ãĥķ"     0.13
    "bie"        0.13
    "ubbles"     0.13
    " Herbert"   0.13
    "dÄĽl"       0.13
    "arity"      0.13
    " scar"      0.13
    Activation Density: 0.205%

    No Known Activations