INDEX
    Explanations

    occurrences of the letter 'H', often as part of proper nouns or titles

    Negative Logits
    ello    -0.27
    ex      -0.21
    ere     -0.20
    OST     -0.19
    ER      -0.19
    ave     -0.18
    ERE     -0.18
    er      -0.18
    HF      -0.18
    z       -0.18
    Positive Logits
    r            0.27
    ruby         0.20
    s            0.20
    rab          0.20
    MAS          0.20
    sing         0.19
    rst          0.19
    rish         0.19
    rad          0.19
    ORIZONTAL    0.19
    Activation Density: 0.099%

    No Known Activations