INDEX
    Explanations

    words related to scientific concepts and technical details

    terms related to explanations and underlying principles

    Negative Logits
    Token             Logit
    "udi"             -0.61
    "IFIED"           -0.61
    "entimes"         -0.61
    "irrel"           -0.60
    "alone"           -0.60
    "inka"            -0.59
    "Ń·"              -0.58
    "Ħ¢"              -0.58
    "culosis"         -0.57
    "irgin"           -0.57
    Positive Logits
    Token             Logit
    " behind"         1.49
    "behind"          1.24
    " involved"       1.10
    " underpin"       1.07
    " Behind"         1.02
    " surrounding"    1.01
    " workings"       1.00
    " underlying"     0.92
    " of"             0.87
    " beneath"        0.85
    Act Density 0.323%

    No Known Activations