INDEX
    Explanations

    assertions and statements of fact

    Negative Logits
    Token          Logit
    "essex"        -0.45
    "fficio"       -0.40
    "unanje"       -0.40
    (blank)        -0.40
    " •"           -0.40
    "IMENTAL"      -0.39
    "pictureBox"   -0.39
    "vábbi"        -0.38
    "Devon"        -0.37
    " ・"          -0.36
    Positive Logits
    Token      Logit
    (blank)    1.80
    "이다"     1.05
    " 다"      1.00
    " 있다"    1.00
    "한다"     0.98
    "하다"     0.97
    "니다"     0.94
    " 한다"    0.90
    "했다"     0.71
    "었다"     0.71
    Act Density 0.004%

    No Known Activations