Explanations
lessons or teachings mentioned in the text
phrases that refer to lessons learned
Negative Logits
berman       -0.70
conn         -0.68
eb           -0.67
urses        -0.65
flush        -0.64
ractive      -0.64
merce        -0.64
filing       -0.64
secut        -0.63
occupancy    -0.63
Positive Logits
lessons      1.38
Learned      1.21
Lessons      1.17
lesson       1.07
learn        0.96
Lear         0.93
Learning     0.88
learnt       0.88
龍契士       0.83
Teach        0.82
Activations Density 0.007%