INDEX
Explanations
the word "lessons"
references to learning or teachings
repeated references to lessons learned
Negative Logits
berman       -0.67
conn         -0.65
flush        -0.62
blob         -0.62
ractive      -0.61
riot         -0.60
secut        -0.60
occupancy    -0.59
represented  -0.59
pub          -0.58
Positive Logits
lessons      1.25
Learned      1.12
Lessons      1.07
lesson       0.92
learnt       0.85
Teach        0.84
chool        0.83
Lear         0.82
learn        0.80
ĸļ           0.78
Activations Density 0.011%
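
An activation density such as the 0.011% above is commonly read as the share of sampled tokens on which the feature fires. The sketch below illustrates that reading only. The function name `activation_density`, the zero threshold, and the sample values are hypothetical and are not taken from this dashboard.

```python
def activation_density(activations, threshold=0.0):
    """Return the percentage of activations strictly above `threshold`.

    Assumption: "density" means the fraction of tokens with a
    non-zero (above-threshold) activation, expressed as a percentage.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return 100.0 * active / len(activations)


# Hypothetical sample: the feature fires on 11 of 100,000 tokens.
sample = [1.0] * 11 + [0.0] * 99_989
density = activation_density(sample)
print(f"{density:.3f}%")
```

Under this reading, a density this low means the feature is sparse: it fires on roughly one token in nine thousand.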