Explanations
statements or descriptions related to learning experiences or gaining knowledge
statements about learning or lessons learned from experiences
Negative Logits
endez        -0.65
backdrop     -0.63
è£ıè¦ļéĨĴ    -0.62
yip          -0.62
chairs       -0.61
pex          -0.60
ItemImage    -0.60
ecake        -0.58
Constructed  -0.57
Buckingham   -0.57
Positive Logits
lesson       1.57
lessons      1.39
valuable     1.12
invaluable   1.11
Lessons      1.05
tricks       0.98
firsthand    0.92
Learned      0.91
wisdom       0.83
trick        0.83
Activations Density 0.150%