INDEX
    Explanations

    references to authors and citations in academic texts

    Negative Logits
     Efq               -1.06
     themſelves        -1.00
     myſelf            -0.97
     houſe             -0.96
     itſelf            -0.96
     disambiguazione   -0.93
     perſon            -0.93
     Jefus             -0.91
     Eſ                -0.91
     Majefty           -0.91
    Positive Logits
     der                0.56
     von                0.55
     {}",               0.54
     cu                 0.54
     col                0.53
     über               0.53
    {}",                0.52
     Vanden             0.52
     pel                0.52
     ri                 0.51
    Activation Density: 0.033%

    No Known Activations