    Negative Logits
    Token               Logit
    "\tI"               -0.07
    "анні"              -0.07
    " eag"              -0.07
    " един"             -0.07
    "แชม"               -0.07
    " bear"             -0.06
    " wife"             -0.06
    "мі"                -0.06
    " สำ"               -0.06
    " justification"    -0.06
    Positive Logits
    Token               Logit
    " concept"          0.23
    " Concept"          0.17
    " concepts"         0.15
    "concept"           0.13
    "Concept"           0.11
    " Concepts"         0.10
    "CEPT"              0.09
    "概念"              0.09
    "cepts"             0.07
    " conceptual"       0.07
    Act Density 0.012%

    No Known Activations