INDEX
    Explanations
    Negative Logits
    Token               Logit
    " knowledge"        -2.59
    "knowledge"         -2.13
    " Knowledge"        -1.96
    "Knowledge"         -1.93
    " KNOWLEDGE"        -1.88
    " conocimiento"     -1.76
    " conhecimento"     -1.64
    " knowled"          -1.48
    " conocimientos"    -1.38
    "知识"              -1.34
    Positive Logits
    Token               Logit
    " of"               0.69
    "/"                 0.55
    " about"            0.54
    " or"               0.51
    "base"              0.51
    " for"              0.50
    " in"               0.50
    " and"              0.49
    "principalTable"    0.48
    ","                 0.48
    Act Density 0.077%

    No Known Activations