    NEGATIVE LOGITS
    Token        Logit
    " being"     -0.91
    " ("         -0.65
    ","          -0.61
    "being"      -0.60
    "."          -0.54
    " BEING"     -0.53
    " and"       -0.52
    " is"        -0.52
    " Being"     -0.49
    "Being"      -0.48
    POSITIVE LOGITS
    Token        Logit
    " raiſ"      1.51
    " Houſe"     1.49
    " uſed"      1.48
    " houſe"     1.45
    " myſelf"    1.42
    " Diſ"       1.41
    " itſelf"    1.41
    " ſtate"     1.39
    " Monfieur"  1.39
    " ſever"     1.38
    Activation Density: 0.027%

    No Known Activations