    Explanations

    reflections and self-perception in various contexts

    Negative Logits
    Token            Logit
    " martyr"        -0.17
    " Municipal"     -0.15
    "roj"            -0.15
    "theid"          -0.15
    "uds"            -0.14
    ".asc"           -0.14
    "ufs"            -0.14
    "osate"          -0.13
    "sst"            -0.13
    "acock"          -0.13

    Positive Logits
    Token            Logit
    " mirror"         0.80
    " mirrors"        0.76
    " Mirror"         0.68
    "mirror"          0.68
    "Mirror"          0.63
    " Mir"            0.58
    " reflection"     0.58
    "éķľ"             0.56
    " mirrored"       0.56
    " mir"            0.56

    Act Density 0.064%

    No Known Activations