INDEX
Explanations
phrases indicating the act of acquiring new knowledge or skills
instances of the word "learn."
Negative Logits
Token      Logit
ded        -0.72
pled       -0.68
AMI        -0.68
�          -0.67
Dak        -0.67
issions    -0.64
tightly    -0.64
adal       -0.64
headed     -0.63
berman     -0.63
Positive Logits
Token      Logit
learn      1.07
Lear       1.03
¿½         0.90
Learning   0.87
ĨĴ         0.86
Learn      0.86
Learn      0.81
ģ«         0.81
learns     0.79
learn      0.76
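The page does not say how its logit values were computed. A common convention for lists like these is to project a feature's output direction through the model's unembedding matrix and rank tokens by the resulting score. The sketch below illustrates that convention only, as an assumption. The names `logit_effects`, `top_and_bottom`, `direction` and `unembedding_rows` are hypothetical, and the data is a made-up toy example rather than real model weights.

```python
def logit_effects(direction, unembedding_rows):
    """Return one score per token: the dot product of the feature
    direction with that token's unembedding vector."""
    return [
        sum(d * w for d, w in zip(direction, row))
        for row in unembedding_rows
    ]


def top_and_bottom(tokens, effects, k=10):
    """Return (most negative k, most positive k) as (token, score) pairs."""
    ranked = sorted(zip(tokens, effects), key=lambda pair: pair[1])
    negative = ranked[:k]        # ascending order: most negative first
    positive = ranked[::-1][:k]  # descending order: most positive first
    return negative, positive


# Toy 2-dimensional data, invented for illustration (assumption).
tokens = ["learn", "ded", "the"]
unembedding_rows = [[1.07, 0.3], [-0.72, 0.1], [0.0, 0.5]]
direction = [1.0, 0.0]

effects = logit_effects(direction, unembedding_rows)
negative, positive = top_and_bottom(tokens, effects, k=1)
print(negative, positive)
```

With `k=10` the same ranking would yield two ten-row lists shaped like the ones above.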
Activations Density 0.018%
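Read as a fraction, 0.018% corresponds to roughly 1.8 active tokens per 10,000. The page does not define the metric, so the sketch below assumes density means the share of sampled tokens on which the feature's activation is above zero. The function name `activation_density`, the `threshold` parameter and the sample data are hypothetical.

```python
def activation_density(activations, threshold=0.0):
    """Fraction of activations strictly above `threshold`.

    Assumption: 'density' means the share of tokens on which the
    feature fires at all; the source page does not define it.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Made-up sample: 18 active tokens out of 100,000.
sample = [0.0] * 99_982 + [1.5] * 18
density = activation_density(sample)
print(f"{density:.3%}")
```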