INDEX
    Explanations

    terms related to social hierarchies and class systems

    Negative Logits
    ém          -0.15
    ladu        -0.15
    eree        -0.15
    steen       -0.14
    .flag       -0.14
    ÑĨин        -0.14
    ynes        -0.14
    ewe         -0.14
    _descr      -0.13
    importe     -0.13
    Positive Logits
    302         0.17
    allet       0.17
    NH          0.16
    567         0.15
    145         0.15
    IFF         0.15
    444         0.14
    271         0.14
    701         0.14
     Bris       0.14
    Activation Density: 0.101%

    No Known Activations