INDEX
    Explanations
    Negative Logits
    ".hero"            -0.08
    "hero"             -0.08
    "tn"               -0.08
    " She"             -0.07
    " she"             -0.07
    "Inheritance"      -0.07
    ".calc"            -0.07
    "Her"              -0.07
    " inheritance"     -0.07
    " Med"             -0.07
    Positive Logits
    " учета"           0.08
    " afrontar"        0.08
    " dotyczą"         0.08
    " удел"            0.08
    " affront"         0.08
    " долж"            0.08
    " введ"            0.08
    " запрещ"          0.08
    " речь"            0.08
    " geschafft"       0.08
    Activation Density 0.131%

    No Known Activations