    Explanations

    frequent occurrences of the word "the."

    Negative Logits
    " spørs"      -0.65
    " overras"    -0.64
    " spørsmål"   -0.61
    " Anſ"        -0.60
    " Wikiseite"  -0.60
    "anggung"     -0.60
    "iſen"        -0.59
    " juſ"        -0.59
    " Reſ"        -0.59
    " raiſ"       -0.59
    Positive Logits
    " of"         0.55
    " OF"         0.55
    " ofthe"      0.52
    " của"        0.51
    " Filip"      0.49
    "OfThe"       0.49
    " Acht"       0.47
                  0.46
    " Of"         0.45
    "OfClass"     0.45
    Act Density 0.428%

    No Known Activations