    Explanations

    terms related to historical territories and political structures

    Negative Logits
    }{*}{
    -0.87
    batore
    -0.79
     الرياضيه
    -0.76
    ائج
    -0.74
     ligiloj
    -0.72
    nasse
    -0.71
    abetes
    -0.71
    ✨:
    -0.71
    vatar
    -0.70
    はじめに
    -0.70
    Positive Logits
     auffi
    0.79
     ſon
    0.71
    めでとう
    0.71
     figliu
    0.70
     Monfieur
    0.69
     fratello
    0.69
     Llew
    0.67
     Jefus
    0.66
    0.65
     Reſ
    0.63
    Activation Density: 2.515%

    No Known Activations