INDEX
    Explanations

    references to the concept of being newly introduced or initiated into a situation

    Negative Logits
    Token           Logit
    " Newman"       -0.15
    "antas"         -0.14
    "ÑģÑĤа"         -0.14
    "andal"         -0.14
    "orias"         -0.14
    "manship"       -0.13
    "anse"          -0.13
    "ãĥ³ãĥĢ"        -0.13
    "uder"          -0.13
    "META"          -0.13

    Positive Logits
    Token           Logit
    "ãģ°ãģĭãĤĬ"     0.20
    "mitter"        0.16
    " newly"        0.16
    "swire"         0.16
    " arrived"      0.15
    " entr"         0.15
    "нез"           0.15
    "åīĽ"           0.15
    "ertz"          0.15
    "ippi"          0.14
    Act Density 0.147%

    No Known Activations