    Explanations

    language and social structures

    Negative Logits
    0.42
    0.41
    رر
    0.39
    я
    0.39
    gladbach
    0.39
     Initially
    0.38
    0.38
     запах
    0.38
    0.38
    0.38
    Positive Logits
     language
    0.52
    Language
    0.50
    0.48
     viya
    0.47
     americanos
    0.46
     Traveller
    0.46
     texts
    0.45
     writings
    0.45
     язык
    0.45
     Language
    0.45
    Act Density 0.013%

    No Known Activations