INDEX
    Explanations

    references to positions of authority or leadership roles

    Negative Logits
    Token          Logit
    " copy"        -0.47
    " toll"        -0.45
    "lccc"         -0.44
    "لاف"          -0.43
    "委托"         -0.41
    " Copy"        -0.41
    " zor"         -0.41
    "thouse"       -0.40
    " rå"          -0.40
    " dumpster"    -0.40
    Positive Logits
    Token                 Logit
    " Italijani"          0.69
    "InjectAttribute"     0.69
    " BorderRadius"       0.68
    "BackStack"           0.66
    "ItemBackground"      0.64
    "TintMode"            0.64
    "LayoutStyle"         0.64
    " CreateTagHelper"    0.63
    " karier"             0.63
    ":✨"                  0.61
    Activation Density: 0.343%

    No Known Activations