INDEX
    Explanations

    references to principles and foundational concepts

    Negative Logits
    gie              -0.19
    akan             -0.16
    NESS             -0.15
    itude            -0.15
    ney              -0.15
    ÑĢÑĥ             -0.15
    lord             -0.14
    ropolis          -0.14
    504              -0.14
    raham            -0.14
    Positive Logits
    -agent            0.30
    ities             0.24
     investigator     0.21
    -Agent            0.19
    ps                0.19
     Investig         0.18
    stown             0.18
    pal               0.18
    /ss               0.17
    ized              0.16
    Act Density 0.020%

    No Known Activations