Explanations
emotionally charged character experiences
Negative Logits
slowdown      -0.15
overposting   -0.15
congestion    -0.15
complained    -0.14
Diversity     -0.13
congest       -0.13
Complaint     -0.12
Ùħض           -0.12
å¿Ļ           -0.12
complaints    -0.12
Positive Logits
redemption    0.27
redeem        0.24
redeemed      0.23
betray        0.23
loyalty       0.22
betrayal      0.22
revenge       0.21
vengeance     0.21
murdering     0.21
betrayed      0.21
Activations Density 0.557%