INDEX
    Explanations

    expressing "I" or "we" statements

    NEGATIVE LOGITS
    "하였다"  0.61
    0.56
    " 하였다"  0.54
    "하였습니다"  0.52
    "原来"  0.52
    " एवं"  0.52
    "ത്വം"  0.52
    "使其"  0.51
    "하였"  0.50
    "原來"  0.48
    POSITIVE LOGITS
    " definitely"  0.79
    " aren"  0.78
    " KNOW"  0.73
    " know"  0.73
    " certainly"  0.73
    "确实"  0.72
    " knows"  0.71
    " isn"  0.67
    "know"  0.64
    " recognize"  0.62
    Act Density 0.114%

    No Known Activations