INDEX
    Explanations

    instances of "first" and related phrases that indicate initial thoughts or experiences

    Negative Logits
     currently
    -0.17
     finally
    -0.16
    alli
    -0.16
    已
    -0.15
    illard
    -0.15
    elia
    -0.14
     ultimately
    -0.14
    目前
    -0.14
     sonst
    -0.14
     now
    -0.14
    Positive Logits
     initially
    0.29
     Initially
    0.25
    Initially
    0.24
     inicial
    0.20
     initial
    0.18
    initial
    0.18
    ulo
    0.16
    639
    0.16
    (initial
    0.16
     scept
    0.16
    Act Density 0.085%

    No Known Activations