    Explanations

    Contains "rh" or "h" in words

    Negative Logits
    Token               Logit
     Monter             -0.08
                        -0.07
     таким              -0.07
    commit              -0.07
    здание              -0.07
     pint               -0.07
                        -0.07
    śmie                -0.07
                        -0.07
     setContentView     -0.07
    Positive Logits
    Token               Logit
    ighb                0.07
    經常                0.07
    œur                 0.07
    _Last               0.07
    涌现出              0.07
     duplicates         0.06
     guessed            0.06
    .about              0.06
    Wonder              0.06
    脸上                0.06
    Activation Density: 0.014%

    No Known Activations