INDEX
Explanations
phrases related to formal events and interactions
Negative Logits
andr        -0.16
harmless    -0.15
umba        -0.15
643         -0.14
iban        -0.14
simpler     -0.13
andre       -0.13
-neutral    -0.13
.toolbox    -0.13
ľ           -0.12
Positive Logits
formal      0.43
正式        0.41
official    0.41
Formal      0.33
official    0.32
formally    0.32
офици       0.31
officially  0.31
full        0.30
Official    0.30
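
Logit lists of this kind often print tokens in the raw byte-level form used by GPT-2-style BPE vocabularies, for example `æŃ£å¼ı` for 正式 or `ÑĦ` for ф. The sketch below shows one way to map such display strings back to readable text. It assumes the underlying tokenizer uses the GPT-2 byte-to-character table, which this page does not state; `decode_token` is a hypothetical helper name, not part of any dashboard API.

```python
def bytes_to_unicode():
    """Build the GPT-2 style table mapping each byte to a printable character."""
    printable = (
        list(range(ord("!"), ord("~") + 1))
        + list(range(0xA1, 0xAC + 1))
        + list(range(0xAE, 0xFF + 1))
    )
    byte_values = printable[:]
    code_points = printable[:]
    n = 0
    for b in range(256):
        if b not in printable:
            # Non-printable bytes are shifted to code points starting at 256.
            byte_values.append(b)
            code_points.append(256 + n)
            n += 1
    return {b: chr(c) for b, c in zip(byte_values, code_points)}


# Inverse table: display character -> original byte.
UNICODE_TO_BYTE = {ch: b for b, ch in bytes_to_unicode().items()}


def decode_token(display: str) -> str:
    """Hypothetical helper: turn a byte-level display string into UTF-8 text.

    Characters outside the table (already-decoded text) are passed through
    as their own UTF-8 bytes, so partially decoded strings also work.
    """
    raw = bytearray()
    for ch in display:
        if ch in UNICODE_TO_BYTE:
            raw.append(UNICODE_TO_BYTE[ch])
        else:
            raw.extend(ch.encode("utf-8"))
    # A token may hold an incomplete UTF-8 sequence; replace it, do not raise.
    return raw.decode("utf-8", errors="replace")


print(decode_token("æŃ£å¼ı"))
print(decode_token("оÑĦиÑĨи"))
```

Under the same assumption, the negative-logit entry `ľ` maps to the single byte 0x9C, a lone UTF-8 continuation byte, so it has no readable decoding on its own.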
Activations Density 0.086%
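
An activation density such as 0.086% is usually read as the fraction of sampled tokens on which the feature fires. The sketch below illustrates that reading under the assumption that "fires" means an activation above zero; the exact definition and sample behind this page's figure are not given, and the counts used are synthetic, chosen only to reproduce the same percentage.

```python
def activation_density(activations, threshold=0.0):
    """Fraction of tokens whose activation exceeds `threshold` (assumed definition)."""
    total = len(activations)
    if total == 0:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / total


# Synthetic example: 86 active tokens out of 100,000 sampled tokens.
sample = [0.0] * 99_914 + [1.0] * 86
density = activation_density(sample)
print(f"{density:.3%}")
```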