INDEX
    Explanations

    names of historical figures and their affiliations

    Negative Logits
        ts       -0.26
        ta       -0.24
        tn       -0.24
        to       -0.24
        tes      -0.23
        techn    -0.23
        tem      -0.23
        tek      -0.23
        td       -0.23
        tools    -0.23
    Positive Logits
        ki          0.29
        ky          0.26
        dorf        0.26
        ký          0.26
        cheid       0.26
        chaft       0.25
        hire        0.24
        piration    0.24
        s           0.23
        chrift      0.23
    Activation Density: 0.124%

    No Known Activations