INDEX
Explanations
references to reminders or significant lessons learned
Head Attr Weights
0:0.08
1:0.07
2:0.08
3:0.08
4:0.09
5:0.08
6:0.08
7:0.07
8:0.07
9:0.07
10:0.09
11:0.09
Negative Logits
undown        -1.79
amping        -1.77
Problem       -1.72
arters        -1.69
Discuss       -1.66
Ack           -1.65
anguage       -1.61
エル          -1.60
iven          -1.60
Translation   -1.59
Positive Logits
FD            1.77
Sloan         1.75
Milton        1.64
Pilgrim       1.64
Mellon        1.60
Mechdragon    1.58
Mons          1.55
congrat       1.54
depos         1.50
Martha        1.48
Activations Density 0.000%