INDEX
    Explanations

    sentences or phrases that express various ideas

    NEGATIVE LOGITS
    Token               Logit
    "quiv"              -0.58
    "yam"               -0.56
    "ç"                 -0.54
    "verständlich"      -0.52
    "thering"           -0.51
    "</em>"             -0.50
    "s"                 -0.50
    (blank in source)   -0.49
    "тивы"              -0.49
    " equili"           -0.48
    POSITIVE LOGITS
    Token               Logit
    " ideas"            1.54
    " IDEA"             1.54
    "Idea"              1.49
    " Ideas"            1.44
    "Ideas"             1.44
    " Idea"             1.42
    "ideas"             1.39
    " IDEAS"            1.28
    "IDEA"              1.23
    " idea"             1.22
    Activation Density: 0.056%

    No Known Activations