Explanations
phrases related to understanding and acquiring knowledge or skills
Negative Logits
260            -0.15
experienced    -0.14
352            -0.14
_defs          -0.14
phi            -0.14
Experienced    -0.14
logan          -0.14
ibar           -0.14
öst            -0.13
baum           -0.13
Positive Logits
knowledge      0.67
knowledge      0.58
Knowledge      0.54
Knowledge      0.51
understanding  0.47
知识           0.43
awareness      0.38
Understanding  0.37
nowledge       0.35
conosc         0.35
Activations Density 0.216%