INDEX
    Explanations

    the beginnings of lists or enumerated points

    phrases that introduce numbered lists or points

    Negative Logits
    Token        Logit
    "orian"      -0.77
    "urated"     -0.69
    "orate"      -0.62
    "estern"     -0.62
    " awaits"    -0.58
    "idia"       -0.58
    "uded"       -0.57
    "atsu"       -0.56
    "ascus"      -0.56
    "azel"       -0.56
    Positive Logits
    Token        Logit
    " Firstly"   1.83
    "Firstly"    1.71
    " First"     1.52
    "First"      1.44
    " first"     1.26
    "first"      1.17
    "1"          1.12
    " FIRST"     1.10
    " 1"         1.05
    " Number"    0.94
    Activation Density: 0.311%

    No Known Activations