INDEX
    Explanations

    references to war and related concepts

    NEGATIVE LOGITS
    " purpoſe"       -0.86
    " ویکی‌پدیای"     -0.83
    " faſt"          -0.82
    " rechange"      -0.80
    " Monfieur"      -0.78
    " pitié"         -0.78
    " pleaſure"      -0.76
    " uſ"            -0.76
    " ſtate"         -0.75
    " tranſ"         -0.73
    POSITIVE LOGITS
    " war"           1.01
    " War"           0.79
    " wars"          0.66
    "tables"         0.66
    " Tab"           0.61
    " Wars"          0.61
    "War"            0.61
    " gen"           0.60
    "tab"            0.59
    "tably"          0.59
    Activation Density: 0.168%

    No Known Activations