INDEX
    Explanations

    phrases or expressions indicative of awareness or understanding of specific information

    Negative Logits
    Token               Logit
    " فريبيس"           -0.56
    "Empereur"          -0.49
    "İstinadlar"        -0.47
                        -0.44
    "intéress"          -0.44
    "emper"             -0.43
    "erapeu"            -0.43
    "erçe"              -0.42
    " invokingState"    -0.42
    "writerow"          -0.42
    Positive Logits
    Token               Logit
    " knowledge"        1.24
    "knowledge"         1.12
    " Knowledge"        1.08
    "Knowledge"         1.03
    " KNOWLEDGE"        0.96
    " Know"             0.92
    " knowing"          0.90
    "Know"              0.87
    " conocimiento"     0.86
    " know"             0.85
    Act Density 0.013%

    No Known Activations