INDEX
Explanations
elements related to personal experiences and self-reflection
Negative Logits
".        -0.73
])));     -0.72
.•        -0.70
NSCoder   -0.69
تقاوى     -0.69
'],       -0.68
};*/      -0.68
/">       -0.68
+         -0.68
%";       -0.68
Positive Logits
lol       2.40
LOL       2.16
haha      2.16
lol       1.98
hahaha    1.94
LOL       1.91
Lol       1.90
jajaja    1.79
😂        1.79
hehe      1.77
Activations Density 0.616%