INDEX
    Explanations

    entities related to historical figures and their relationships

    Negative Logits
    "iram"      -0.15
    "reater"    -0.15
    "quirrel"   -0.15
    "ÅĻi"       -0.14
    "rů"        -0.14
    "Earn"      -0.14
    "niej"      -0.14
    " Kral"     -0.14
    "erca"      -0.14
    " klu"      -0.14

    Positive Logits
    "akh"       0.20
    "zh"        0.20
    " Volk"     0.19
    " Push"     0.19
    "ugin"      0.18
    "achen"     0.18
    "enin"      0.18
    " Tro"      0.17
    "istrat"    0.17
    "agina"     0.16

    Activation Density: 0.088%

    No Known Activations