INDEX
    Explanations

    references to written materials or annotations, particularly "notes."

    Negative Logits
    -0.70
     transfieras
    -0.58
    >>>
    -0.57
    ovala
    -0.56
    															
    -0.55
    osexuality
    -0.53
    Reich
    -0.52
    -0.52
    inac
    -0.51
     responsibility
    -0.51
    Positive Logits
    بوابة      0.71
    equity     0.69
    notes      0.68
     equity    0.66
     Letras    0.65
    SPATH      0.65
     notes     0.64
    UVWXYZ     0.63
     pills     0.63
     wits      0.63
    Activation Density: 0.063%

    No Known Activations