Explanations
themes of punishment and revenge
Negative Logits
.pull        -0.14
.opend       -0.14
Alarm        -0.14
æĥij          -0.14
Alarm        -0.14
èµŀ          -0.14
vac          -0.13
aji          -0.13
è³¢          -0.13
(Optional    -0.13
Positive Logits
revenge      0.52
vengeance    0.47
Revenge      0.40
retaliation  0.40
vend         0.39
venge        0.38
justice      0.35
retal        0.34
av           0.34
repr         0.34
Activations Density: 0.194%