INDEX
    Explanations

    phrases that indicate the beginning of sentences

    Negative Logits
    ght         -0.16
    erd         -0.15
    ccion       -0.15
    variants    -0.15
    elah        -0.15
    erald       -0.15
    berra       -0.15
    athan       -0.14
    ero         -0.14
    rt          -0.14
    Positive Logits
     last       0.18
     present    0.18
     first      0.17
    uly         0.17
    SAME        0.17
    woord       0.16
    ventus      0.15
    testing     0.15
    wood        0.15
    contr       0.15
    Activation Density: 0.054%

    No Known Activations