INDEX
    Explanations

    phrases related to historical figures and significant events

    Negative Logits
    wart         -0.16
     Turnbull    -0.15
    /*č↵         -0.15
    Ñģед         -0.15
    okud         -0.15
    iben         -0.14
     aba         -0.14
    SSI          -0.14
    .sg          -0.14
    ibu          -0.13
    Positive Logits
    imat         0.16
    ecies        0.16
     same        0.14
    ogle         0.14
    leton        0.14
    odge         0.14
     cast        0.14
    ertz         0.14
    yal          0.13
    akah         0.13
    Activation Density: 1.972%

    No Known Activations