INDEX
    Explanations

    references to specific names or titles starting with "De."

    Negative Logits
    Token          Logit
    "fir"          -0.17
    "ro"           -0.17
    "rote"         -0.16
    "rist"         -0.16
    "f"            -0.15
    "ra"           -0.15
    "oub"          -0.15
    "xa"           -0.15
    "res"          -0.15
    "wo"           -0.15
    Positive Logits
    Token          Logit
    "acon"         0.19
    "žel"          0.19
    " facto"       0.18
    "oxy"          0.17
    "construct"    0.17
    "eds"          0.17
    "constructed"  0.16
    "initely"      0.16
    " deal"        0.16
    "anship"       0.16
    Activation Density: 0.051%

    No Known Activations