INDEX
    Explanations

    innate desire for meaning

    Negative Logits
    " an"              0.45
    [token missing]    0.45
    "相关的"            0.44
    "r"                0.44
    "2"                0.43
    "said"             0.42
    "If"               0.42
    "The"              0.41
    "ني"               0.41
    "zu"               0.41
    Positive Logits
    " own"             1.05
    " Own"             0.72
    " ability"         0.67
    " innate"          0.61
    " OWN"             0.59
    " собственные"     0.58
    "own"              0.57
    " inability"       0.54
    " disruptive"      0.52
    " innovative"      0.52
    Activation Density: 0.008%

    No Known Activations