    NEGATIVE LOGITS
    Token              Logit
    " health"          -1.41
    "health"           -1.19
    " Health"          -1.07
    "Health"           -0.99
    " HEALTH"          -0.96
    "HEALTH"           -0.96
    "健康"             -0.77
    " kesehatan"       -0.75
    " heath"           -0.74
    " healthcare"      -0.70
    POSITIVE LOGITS
    Token              Logit
    " 跳转至"          0.55
    "ponses"           0.52
    " ostro"           0.50
    "s"                0.50
    " desta"           0.49
    "yntaxException"   0.49
    " syndromes"       0.49
    " anvil"           0.49
    " originais"       0.49
    "ActionCreators"   0.49
    Activation Density: 0.111%

    No Known Activations