INDEX
    Explanations

    terms indicating collective identity or inclusivity

    Negative Logits
    境的            -0.68
    先の            -0.66
    });\n         -0.66
    :'/           -0.65
     @"";         -0.65
    ないで          -0.63
    先に            -0.63
     an           -0.62
    vos           -0.62
     σ            -0.61
    Positive Logits
     everyone     2.34
    everyone      2.29
    Everyone      2.21
     Everyone     2.21
     everybody    2.17
    everybody     2.14
     Everybody    2.10
     EVERYONE     2.04
    Everybody     2.04
    everything    1.57
    Activation Density 0.033%

    No Known Activations