    Explanations

    phrases with definite articles

    Negative Logits
        " itſelf"         -1.01
        " raiſ"           -0.94
        " myſelf"         -0.92
        " preſent"        -0.90
        " Plenum"         -0.88
        " chré"           -0.86
        " themſelves"     -0.86
        " pleaſure"       -0.86
        " religieuses"    -0.84
        " Efq"            -0.84
    Positive Logits
        " the"             1.19
        " de"              1.02
        " The"             0.95
        " De"              0.83
        "The"              0.76
        "Οι"               0.75
        " den"             0.75
        (whitespace)       0.74
        " Z"               0.73
        "the"              0.73
    Activation Density: 0.022%

    No Known Activations