    Explanations

    references to money and theft

    Negative Logits
    iment
    -0.20
     cap
    -0.15
    jah
    -0.15
    enty
    -0.15
    λεί
    -0.15
     disclosures
    -0.14
    zos
    -0.14
    alysis
    -0.14
    coe
    -0.14
    criptor
    -0.14
    Positive Logits
     Germ
    0.16
    adir
    0.15
    仁
    0.15
     sat
    0.15
    lichkeit
    0.14
    ruit
    0.14
    /renderer
    0.14
    inton
    0.14
    ijke
    0.14
    фектив
    0.13
    Act Density 0.094%

    No Known Activations