    Explanations

    phrases suggesting contradictions or nuances in character or societal roles

    Negative Logits
    "aggi"         -0.15
    " unfinished"  -0.15
    "oka"          -0.15
    "iaux"         -0.14
    " Tro"         -0.14
    "899"          -0.14
    "ient"         -0.14
    "629"          -0.13
    "sek"          -0.13
    "898"          -0.13
    Positive Logits
    " immune"    0.59
    " inv"       0.49
    "immune"     0.46
    " immunity"  0.40
    " bullet"    0.39
    " Imm"       0.39
    " imp"       0.38
    "-imm"       0.37
    " imm"       0.36
    " immun"     0.34
    Act Density 0.359%

    No Known Activations