INDEX
    Explanations
    Negative Logits
    " healthy"        -1.50
    "healthy"         -1.35
    "Healthy"         -1.25
    " saudável"       -1.09
    " Healthy"        -1.07
    " gesunde"        -1.06
    " sustainable"    -0.96
    " healthier"      -0.95
    " sehat"          -0.94
    " healthiest"     -0.94
    Positive Logits
    "gonic"           0.55
    "d"               0.54
    "RenderAtEndOf"   0.51
    "ه"               0.48
    "eará"            0.46
    "g"               0.46
    " Sheeran"        0.44
    "ikian"           0.44
    "')}}"            0.43
    " old"            0.43
    Act Density 0.174%

    No Known Activations