INDEX
    Explanations

    the string "Han" and related words

    the name "Han" across various contexts

    Negative Logits
    Token           Logit
    "anwhile"       -0.84
    " destro"       -0.82
    "utics"         -0.71
    "URES"          -0.70
    "Downloadha"    -0.70
    "ktop"          -0.68
    "okemon"        -0.68
    "ODUCT"         -0.67
    "atoon"         -0.67
    "terday"        -0.67
    Positive Logits
    Token           Logit
    "auer"          0.97
    "uman"          0.95
    "ning"          0.91
    "ifa"           0.89
    "wei"           0.88
    "lon"           0.87
    " Solo"         0.87
    "bang"          0.86
    "wal"           0.82
    "hart"          0.81
    Activation Density: 0.008%

    No Known Activations