INDEX
    Explanations

    references to sources or origins in the text

    Negative Logits
    Token             Logit
    " ſta"            -0.60
    "RTGC"            -0.60
    " Efq"            -0.56
    " ſever"          -0.54
    " quæ"            -0.54
    "sidemargin"      -0.53
    "Hauptartikel"    -0.53
    " ſtre"           -0.52
    " ſol"            -0.51
    " otomatig"       -0.51
    Positive Logits
    Token             Logit
    " from"           0.66
    "from"            0.53
    " FROM"           0.47
    "FROM"            0.46
    " från"           0.45
    " From"           0.44
    "From"            0.43
    "来自"            0.42
    " dari"           0.41
    " 来自"           0.41
    Activation Density: 0.086%

    No Known Activations