    Explanations

    references to social issues and cultural critiques

    Negative Logits
    "Whilst"        -1.29
    " Whilst"       -1.29
    "非常的"        -1.22
    " poichè"       -1.12
    " whilst"       -1.12
    " dimana"       -1.08
    " içerisinde"   -1.07
    " diatas"       -1.07
    " می‌باشد"       -1.05
    " didalam"      -1.05
    Positive Logits
    " freilich"     1.35
    "たとえば"      0.83
    "־"             0.80
    " voilà"        0.80
                    0.79
    " guère"        0.79
    " etwa"         0.78
    "けっこう"      0.78
    " ostensibly"   0.78
    "eabouts"       0.78
    Act Density 3.493%

    No Known Activations