INDEX
    NEGATIVE LOGITS
    " Harcourt"     0.85
    "]),"           0.80
    "مال"           0.78
    "ใจ"            0.77
    "менно"         0.75
    "Sans"          0.74
    " Pří"          0.74
    "زاء"           0.73
    ")],"           0.72
    " }()"          0.72
    POSITIVE LOGITS
    " notions"      1.41
    " notion"       1.31
    " conception"   0.97
    " idea"         0.96
    " concepts"     0.92
    " idée"         0.91
    " conceptions"  0.87
    " altında"      0.86
    " concept"      0.85
    " Notion"       0.85
    Activation Density: 0.033%

    No Known Activations