    Explanations

    inclusive and respectful behavior

    Negative Logits
    Token               Value
    "this"              0.45
    "3"                 0.44
    " very"             0.43
    "very"              0.41
    "te"                0.40
    " irreversible"     0.40
    "one"               0.40
    " necessitating"    0.40
    "n"                 0.40
    "ro"                0.39
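    Positive Logits
    Token               Value
    " 输出"             0.53  (Chinese: "output")
    "Calories"          0.49
    "Fitness"           0.48
    " फिटनेस"            0.46  (Hindi: "fitness")
    "Nutrition"         0.46
    " प्रोटीन"            0.45  (Hindi: "protein")
    "营养"              0.45  (Chinese: "nutrition")
    "输出"              0.43  (Chinese: "output")
    " moistur"          0.43
    " khỏe"             0.43  (Vietnamese: "healthy")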
    Act Density 0.005%

    No Known Activations