Explanations
lessons or insights from events or actions
Negative Logits
occupancy     -0.75
endars        -0.68
lobb          -0.66
omin          -0.63
roid          -0.63
flush         -0.62
umbers        -0.62
contiguous    -0.61
trak          -0.61
rencies       -0.60
Positive Logits
Learned       1.48
learned       1.22
Lear          1.16
learnt        1.16
lessons       1.10
lesson        1.10
taught        0.97
learn         0.97
Teach         0.94
ĸļ            0.90
Activations Density: 0.039%