INDEX
Explanations
themes of sacrifice and endurance in challenging situations
Negative Logits
Token         Logit
Sinai         -0.16
gin           -0.15
illin         -0.14
amax          -0.14
_SER          -0.14
umbn          -0.14
yourself      -0.14
th            -0.14
då            -0.13
yourselves    -0.13
Positive Logits
Token         Logit
la            0.44
las           0.43
los           0.42
su            0.42
el            0.42
sus           0.32
su            0.31
_su           0.28
la            0.27
-su           0.27
Activations Density: 0.089%