INDEX
    Explanations

    Common sentence beginnings

    Negative Logits
    " spokeswoman"   -0.07
    " Shoes"         -0.06
    " Bio"           -0.06
    " grooming"      -0.06
    " sexo"          -0.06
    " обыч"          -0.06
    ".nl"            -0.06
    "ária"           -0.06
    " spokesman"     -0.06
    "optic"          -0.06
    Positive Logits
    "twig"           0.07
    " malzem"        0.07
    ".tight"         0.07
    " stir"          0.06
    "لع"             0.06
    ".inflate"       0.06
    "цез"            0.06
    "нам"            0.06
    " sher"          0.06
    "LM"             0.06
    Activation Density: 0.053%

    No Known Activations