    Explanations

    names that contain "ha" with varying levels of activation, potentially indicating a preference for a specific name or concept

    repeated occurrences of the substring "ha"

    Negative Logits
    Token             Logit
    papers            -0.87
    atories           -0.76
    rations           -0.72
    ateur             -0.70
    lace              -0.67
    enhagen           -0.66
    entric            -0.65
     largeDownload    -0.64
    lines             -0.64
    parts             -0.62
    Positive Logits
    Token             Logit
    wn                1.10
    iku               0.96
    ha                0.94
    pless             0.92
    ichi              0.90
    qua               0.90
    jj                0.89
    fter              0.85
    pton              0.84
    pp                0.84
    Activation Density: 0.010%
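
For readers unfamiliar with the density figure, the following is a minimal sketch of how such a number could be computed, assuming density means the fraction of token positions on which the feature's activation is above zero. The function name `activation_density`, the `threshold` parameter, and the sample data are all hypothetical; the dashboard does not state how its value was actually derived.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds threshold.

    Assumption: "density" is the share of tokens on which the feature fires.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical data: the feature fires on 1 token out of 10,000.
acts = [0.0] * 9999 + [2.5]
density = activation_density(acts)
print(f"{density:.3%}")  # prints 0.010%
```

Under this reading, a density of 0.010% corresponds to roughly one active token in every 10,000.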

    No Known Activations