INDEX
    Explanations

    names of people or characters in a text

    pronouns and their associated forms in various contexts

    Negative Logits
    Token               Logit
    "̶"                 -0.65
    " consolidation"    -0.65
    "eatures"           -0.62
    "drivers"           -0.61
    " corros"           -0.60
    "%]"                -0.59
    "critical"          -0.57
    "steamapps"         -0.56
    " unaff"            -0.56
    "norm"              -0.56
    Positive Logits
    Token               Logit
    "neau"              0.87
    "oya"               0.80
    " Pradesh"          0.76
    "nikov"             0.75
    "chuk"              0.73
    "orf"               0.73
    "wu"                0.73
    " Brothers"         0.73
    "ofer"              0.73
    "ippi"              0.72
    Activation Density: 0.240%

    No Known Activations