INDEX
    Explanations

    negations and expressions of personal experience or belief

    Negative Logits
    Token         Logit
    "ss"          -0.16
    "uide"        -0.15
    "IONS"        -0.14
    "SS"          -0.14
    "ij"          -0.14
    " mainly"     -0.14
    "oton"        -0.14
    "isset"       -0.13
    "eza"         -0.13
    "main"        -0.13
    Positive Logits
    Token         Logit
    "å͝ä¸Ģ"      0.34
    " един"       0.31
    " unique"     0.26
    " jedin"      0.24
    " alone"      0.24
    " einz"       0.24
    "unique"      0.22
    " único"      0.22
    " earliest"   0.22
    " única"      0.22
    Activation Density: 0.206%

    No Known Activations