INDEX
    Explanations

    references to the word "fort" and its variations

    Negative Logits
    ikan       -0.16
    PECT       -0.15
    perator    -0.15
    å±Ģ        -0.15
    aterno     -0.15
    bsub       -0.15
    ÅĤu        -0.15
    iyah       -0.15
    æļĸ        -0.14
    æĹ         -0.14
    Positive Logits
    aleza          0.31
    una            0.31
    unes           0.30
    resses         0.30
    ress           0.30
    uit            0.29
    une            0.29
    ifications     0.29
    itude          0.28
    ification      0.26
    Act Density 0.011%

    No Known Activations