INDEX
Explanations
terms associated with fragility and delicateness
Negative Logits
Token            Logit
noires           -0.56
хьтан            -0.54
plufieurs        -0.52
Wikimedijinoj    -0.52
hitam            -0.52
seamnă           -0.49
føring           -0.49
politiet         -0.49
cerrados         -0.48
httphttps        -0.47
Positive Logits
Token            Logit
weak             1.02
weak             0.93
fragile          0.91
fragility        0.85
Weak             0.84
Weak             0.83
weaker           0.82
flimsy           0.79
weakly           0.77
thin             0.76
Activations Density: 0.960%
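As a rough illustration of what a density figure like this could mean, the sketch below assumes "Activations Density" is the fraction of sampled tokens on which the feature's activation exceeds zero. That definition is an assumption, not something stated on this page, and the function name `activation_density` and the sample values are hypothetical.

```python
from typing import Sequence


def activation_density(activations: Sequence[float], threshold: float = 0.0) -> float:
    """Return the fraction of tokens whose activation exceeds `threshold`.

    Assumption: density counts a token as "active" when its activation is
    strictly greater than the threshold (zero by default).
    """
    if len(activations) == 0:
        raise ValueError("activations must not be empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: 96 active tokens out of 10,000 sampled tokens.
sample = [1.5] * 96 + [0.0] * 9904
density = activation_density(sample)
print(f"{density:.3%}")
```

Under this assumed definition, a density of 0.960% would correspond to the feature firing on roughly 96 of every 10,000 sampled tokens.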