Explanations
expressions related to emotional reactions or states
Negative Logits
Token      Logit
festive    -0.69
sugg       -0.69
ãĥĩãĤ£     -0.67
ãĥ©ãĥ³     -0.63
xtap       -0.62
coh        -0.59
ãĥĹ        -0.57
å¼         -0.56
Points     -0.56
Branch     -0.56
Positive Logits
Token          Logit
!--            0.83
Himself        0.78
anyways        0.74
nesses         0.73
anyway         0.72
selves         0.72
ALSO           0.72
nonetheless    0.71
insofar        0.71
WHEN           0.69
Activations Density: 0.478%