    Explanations

    references to historical and cultural elements related to Japan

    Negative Logits
     Japanese
    -0.25
    Japanese
    -0.23
    日本
    -0.22
     Jap
    -0.22
     Japan
    -0.22
     japanese
    -0.22
     japan
    -0.22
    Japan
    -0.21
     ژاپ
    -0.20
    、日本
    -0.20
    Positive Logits
     lord
    0.24
     clan
    0.23
     Å
    0.22
     Clan
    0.21
     Lord
    0.20
     clans
    0.20
     Domain
    0.20
     lords
    0.20
     sam
    0.19
     domain
    0.19
    Act Density 0.010%

    No Known Activations