    Explanations

    references to the concept of "empiricism" and aspects related to beauty

    Negative Logits
    " Reſ"       -1.69
    " pleaſure"  -1.61
    " Houſe"     -1.61
    " doubtnut"  -1.60
    " itſelf"    -1.60
    " myſelf"    -1.59
    " ―――――"     -1.55
    " Theſe"     -1.52
    "ſelf"       -1.51
    " ſeveral"   -1.50
    Positive Logits
                 0.91
    " emp"       0.81
    ","          0.80
    "."          0.77
    "y"          0.71
    "emp"        0.69
    "h"          0.69
    "ny"         0.68
    "'"          0.68
    " ("         0.68
    Activation Density: 0.432%

    No Known Activations