Explanations
expressions of frustration or overwhelming feelings regarding personal choices and societal expectations
Negative Logits
anel    -0.15
val     -0.14
aki     -0.14
rowse   -0.14
dej     -0.14
TRS     -0.14
Aub     -0.14
611     -0.13
angu    -0.13
erald   -0.13
Positive Logits
underst     0.14
tend        0.14
mus         0.14
Normally    0.14
ãĤ¶ãĥ¼      0.14
寸          0.14
enko        0.14
imus        0.14
irsch       0.13
tends       0.13
Activations Density 0.283%