Explanations
sentiments of regret and reflection on past decisions
Negative Logits
Token      Logit
dov        -0.14
fol        -0.14
punch      -0.14
owitz      -0.14
jer        -0.14
(;;        -0.14
zeroes     -0.14
tossed     -0.14
ahr        -0.13
punches    -0.13
Positive Logits
Token      Logit
Sorted     0.19
#ab        0.19
fab        0.18
sorted     0.18
chers      0.17
££         0.17
Sorted     0.17
moz        0.16
extortion  0.16
Shall      0.15
Activations Density: 0.264%