INDEX
    Explanations

    mentions of sources or attributions in texts

    references to sources or citations in a document

    Negative Logits
    Token         Logit
    "ucket"       -0.78
    "raising"     -0.74
    " frogs"      -0.69
    "pad"         -0.67
    "payer"       -0.66
    "inary"       -0.66
    "eared"       -0.65
    "loss"        -0.65
    "owl"         -0.65
    "thin"        -0.65
    Positive Logits
    Token         Logit
    " Via"        1.02
    "Via"         0.95
    "via"         0.93
    " Wikimedia"  0.90
    "ulture"      0.78
    "ultural"     0.77
    " Email"      0.73
    " Religion"   0.70
    "ï¸ı"         0.67
    "WARD"        0.66
    Act Density 0.009%

    No Known Activations