Explanations
expressions of excitement and pride about personal or professional achievements
Negative Logits
уш      -0.16
utory   -0.15
ald     -0.14
Yin     -0.14
ates    -0.14
genius  -0.14
de      -0.14
oad     -0.14
/rc     -0.14
,       -0.14
Positive Logits
这么    0.22
इतन     0.21
こんな  0.20
如此    0.19
이렇게  0.19
böyle   0.17
tão     0.16
这样    0.15
-gnu    0.15
国产    0.15
Activations Density 0.055%