INDEX
    Explanations

    references to people and their various roles or characteristics

    Negative Logits
    ónico         -0.20
    áct           -0.18
    izador        -0.17
    ecedor        -0.17
    acted         -0.17
    ipple         -0.16
    éric          -0.16
    ký            -0.16
    kowski        -0.16
    Schedulers    -0.15
    Positive Logits
    ova           0.48
    kova          0.43
    eva           0.39
    anova         0.38
    ová           0.36
    nova          0.35
    ková          0.35
    ueva          0.33
    кова          0.33
    ова           0.32
    Activation Density: 0.024%

    No Known Activations