INDEX
    Explanations

    references to life events or conditions

    Negative Logits
    "rove"      -0.17
    "vil"       -0.17
    "IGO"       -0.15
    " دÙĩ"      -0.15
    " Ku"       -0.15
    "и"         -0.15
    "ève"       -0.15
    "rib"       -0.14
    " warm"     -0.14
    "otte"      -0.14

    Positive Logits
    "inha"      0.17
    "utto"      0.15
    "hangi"     0.15
    "itzer"     0.14
    "iri"       0.14
    " Casual"   0.14
    "ibold"     0.14
    "adoo"      0.14
    "imers"     0.14
    "antan"     0.14
    Act Density 0.000%

    No Known Activations