INDEX
    Explanations

    references to self or personal involvement

    Negative Logits
    atch           -0.16
    .FontStyle     -0.15
    illos          -0.15
     Už            -0.15
    mond           -0.14
    ond            -0.14
    AAAA           -0.14
    IMAL           -0.13
    immel          -0.13
    qui            -0.13
    Positive Logits
    -même          0.25
    zelf           0.24
    zÅij           0.16
    elves          0.15
    362            0.15
    enger          0.15
     Executor      0.14
    ikat           0.14
    rollo          0.14
    ÑĩаÑģно        0.14
    Activation Density: 0.072%

    No Known Activations