INDEX
Explanations
inclusive and respectful behavior
Negative Logits
this            0.45
3               0.44
very            0.43
very            0.41
te              0.40
irreversible    0.40
one             0.40
necessitating   0.40
n               0.40
ro              0.39
Positive Logits
输出            0.53
Calories        0.49
Fitness         0.48
फिटनेस          0.46
Nutrition       0.46
प्रोटीन          0.45
营养            0.45
输出            0.43
moistur         0.43
khỏe            0.43
Activations Density 0.005%