Explanations
references to brokenness or injury
Negative Logits
ropa            -0.16
ousse           -0.16
(||             -0.16
çĦ¼             -0.16
rex             -0.15
rove            -0.15
gard            -0.15
_Execute        -0.14
rame            -0.14
CursorPosition  -0.14
Positive Logits
broken          0.25
-hearted        0.24
broken          0.24
Broken          0.23
heart           0.23
ess             0.21
Broken          0.20
-backed         0.19
shattered       0.19
broke           0.18
Activations Density: 0.018%