INDEX
    Explanations
    Negative Logits
    "mare"        -0.08
    " Stanton"    -0.08
    ""            -0.08
    "kwa"         -0.07
    "кач"         -0.07
    " Sherman"    -0.07
    "Mor"         -0.07
    " horário"    -0.07
    "ivity"       -0.07
    " Rogers"     -0.07
    Positive Logits
    " fe"               0.09
    " Rome"             0.09
    " civilizations"    0.08
    "-fashioned"        0.08
    ""                  0.08
    "文明"              0.08
    " Society"          0.07
    " hech"             0.07
    " Vlad"             0.07
    "医学"              0.07
    Activation Density: 0.012%

    No Known Activations