INDEX
    Explanations

    phrases related to formal events and interactions

    Negative Logits
    andr
    -0.16
     harmless
    -0.15
    umba
    -0.15
    643
    -0.14
    iban
    -0.14
     simpler
    -0.13
    andre
    -0.13
    -neutral
    -0.13
    .toolbox
    -0.13
    ľ
    -0.12
    Positive Logits
     formal
    0.43
    正式
    0.41
     official
    0.41
     Formal
    0.33
    official
    0.32
     formally
    0.32
     офици
    0.31
     officially
    0.31
     full
    0.30
     Official
    0.30
    Activation Density 0.086%

    No Known Activations