INDEX
    Explanations

    statements of argumentation or claims made in a narrative or discourse

    Negative Logits
    " vel"    -0.16
    "dek"     -0.14
    "agens"   -0.14
    "egas"    -0.14
    "eric"    -0.14
    "von"     -0.13
    " Ed"     -0.13
    " ho"     -0.13
    "ugu"     -0.13
    "arga"    -0.13
    Positive Logits
    "_epi"            0.16
    "omik"            0.15
    "816"             0.15
    "uild"            0.15
    "ContentLoaded"   0.14
    "ýt"              0.14
    "rame"            0.14
    "šak"             0.14
    "%^"              0.14
    "nez"             0.14
    Activation Density 0.195%

    No Known Activations