Explanations
themes of sacrifice and social justice
Negative Logits
Token     Logit
째         -0.17
ighter    -0.16
ưng       -0.15
SCAN      -0.15
ods       -0.14
SCAN      -0.14
бол       -0.14
hart      -0.14
oded      -0.14
boy       -0.14
Positive Logits
Token     Logit
isel      0.16
ặt        0.15
ÙİØ£      0.15
untas     0.15
icers     0.14
ropol     0.14
ukan      0.14
Hunt      0.14
emies     0.14
Hop       0.14
Activations Density 0.067%