INDEX
    Explanations

    statements indicating existence or presence

    Negative Logits
    Token       Logit
    infatti     -0.70
    a           -0.68
    an          -0.66
    olyan       -0.62
    It          -0.58
    is          -0.56
    [           -0.54
    The         -0.54
    as          -0.53
    một         -0.52
    Positive Logits
    Token       Logit
    myſelf      1.03
    itſelf      1.01
    leaſt       1.01
    Efq         0.98
    ―――――       0.96
    Houſe       0.93
    whoſe       0.93
    uſed        0.92
    Reſ         0.92
    raiſ        0.91
    Activation Density: 0.511%

    No Known Activations