    Explanations

    references to historical figures and their contributions

    Negative Logits
    šak     -0.17
    antal   -0.16
    cest    -0.15
    liers   -0.15
    vox     -0.14
    lož     -0.14
     bri    -0.14
    aan     -0.14
    luk     -0.13
    uite    -0.13
    Positive Logits
     Hel    0.26
     Fel    0.23
    Hel     0.22
    Fel     0.20
     Gel    0.19
     fel    0.19
     HEL    0.19
    ेल      0.18
     Felix  0.18
    fel     0.18
    Act Density 0.068%

    No Known Activations