INDEX
    Explanations

    references to philosophical concepts and discussions

    Negative Logits
    illet       -0.15
    èĶ          -0.15
    ris         -0.15
    .blit       -0.14
    alent       -0.14
    縮          -0.14
    dash        -0.14
    ity         -0.14
    ownik       -0.14
    rysler      -0.14
    Positive Logits
    osoph       0.23
    ippi        0.21
    оÑģоÑĦ      0.20
    phil        0.19
    osopher     0.19
    ical        0.17
    á»ģn        0.16
    y           0.15
    Soph        0.15
    Phil        0.15
    Activation Density: 0.014%

    No Known Activations