INDEX
    Explanations

    phrases indicating prior actions or events

    Negative Logits
    " reaſon"      -0.77
    " pleaſure"    -0.77
    " ſeveral"     -0.77
    " Jefus"       -0.75
    "ſelf"         -0.74
    " purpoſe"     -0.70
    " raiſ"        -0.69
    " Conſ"        -0.69
    "ſelves"       -0.68
    " ſtate"       -0.68
    Positive Logits
    " before"      0.94
    " sebelum"     0.94
    " πριν"        0.83
    " bevor"       0.82
    "before"       0.82
    " før"         0.81
    "BEFORE"       0.79
    " BEFORE"      0.74
    " Sebelum"     0.74
    " voordat"     0.74
    Activation Density: 0.237%

    No Known Activations