    Explanations

    appropriate or inappropriate behavior

    Negative Logits
    Token          Logit
     sayesinde     0.68
    Plus           0.68
    Gracias        0.66
     Needed        0.65
    Available      0.64
    便于           0.62
     ईमान          0.62
     nötig         0.61
    Forbidden      0.60
    Required       0.60
    Positive Logits
    Token            Logit
     practices       1.15
    erweise          0.97
     behavior        0.97
     طریقے           0.91
     behaviors       0.91
     Practices       0.87
     behaviour       0.86
     behaviours      0.86
     comportamento   0.80
    practices        0.79
    Activation Density: 0.457%

    No Known Activations