    Negative Logits
     tragedy          1.36
     stimulant        1.17
     influential      1.16
     commodity        1.16
     revolutionary    1.16
     calamity         1.16
     contributor      1.15
     comedian         1.14
     reactionary      1.13
     thing            1.13
    Positive Logits
    各自                1.09
    它们的               0.94
     đều              0.89
    们的                0.85
    неш               0.84
    ทุกคน             0.84
     зовніш           0.82
     सरकारों          0.82
    ombres            0.80
    들은                0.78
    Activation Density: 0.275%

    No Known Activations