Explanations
describing positive appearance
Negative Logits
이자    0.39
पारदर्शिता    0.38
செயல    0.37
色彩    0.36
湄    0.36
uctive    0.35
ไซ    0.35
व्यवहार    0.34
意味着    0.34
functionality    0.34
Positive Logits
looks    0.71
looks    0.70
wygląda    0.69
polished    0.65
wygl    0.64
выглядит    0.64
izgled    0.63
professional    0.61
polished    0.59
profesional    0.58
Activations Density: 0.019%