INDEX
Explanations
references to personal experiences and narratives involving familial relationships and individual challenges
Negative Logits
<eos>                -0.42
des                  -0.42
sur                  -0.38
[token not shown]    -0.38
la                   -0.37
look                 -0.37
lo                   -0.36
peu                  -0.36
mise                 -0.35
lot                  -0.34
Positive Logits
myſelf             1.06
Himself            0.97
himſelf            0.96
himself            0.96
AssemblyCompany    0.93
Efq                0.91
herself            0.90
itſelf             0.90
himself            0.89
Anſ                0.88
Activations Density 0.556%