Explanations
expressions of realization or moments of insight regarding serious topics
Negative Logits
Token          Logit
abby           -0.17
eya            -0.14
aras           -0.14
à¹Ģà¸ģ         -0.14
intl           -0.13
ilyn           -0.13
//{{           -0.13
еÑģÑĮ          -0.13
ILA            -0.13
enheim         -0.13
Positive Logits
Token          Logit
serious        1.18
Serious        1.05
seriousness    1.01
serious        0.98
seriously      0.88
-ser           0.79
Seriously      0.74
seri           0.69
Seriously      0.68
Ser            0.66
Activations Density 0.027%