    NEGATIVE LOGITS
    " health"         -0.58
    "__]"             -0.55
    "oston"           -0.52
    " überweisen"     -0.50
    " Health"         -0.49
    "health"          -0.48
    " gezond"         -0.47
    " operation"      -0.47
    "Demografía"      -0.47
    " Ausstattung"    -0.47
    POSITIVE LOGITS
    "AutoScaleMode"   0.68
    "Geplaatst"       0.67
    "aarrggbb"        0.66
    "mahaman"         0.63
    "gyz"             0.63
    "matchCondition"  0.62
    ")['"             0.60
    " noDo"           0.59
    " niyang"         0.59
    "instancetype"    0.58
    Activation Density: 0.057%

    No Known Activations