    Explanations
    No Explanations Found
    Negative Logits
    Ħ¢               -0.83
     disgusted       -0.76
     unman           -0.65
     civilisation    -0.64
    VERT             -0.64
     Passing         -0.64
     ourselves       -0.63
     policing        -0.61
     AGA             -0.61
     traumatic       -0.61
    Positive Logits
     GOODMAN         0.82
     Constantin      0.74
    fx               0.72
    rich             0.72
    eous             0.71
     Brach           0.71
     Hos             0.70
    heng             0.69
    isi              0.69
    hya              0.69
    Act Density 0.000%

    No Known Activations