INDEX
    Explanations

    Turkish words and phrases

    fragments of words or syllables, likely indicating a focus on language patterns and morphology

    Negative Logits
     Rhodes
    -1.06
     Wilmington
    -0.88
     Rouge
    -0.86
     Cruise
    -0.84
     Belle
    -0.83
     Scorp
    -0.83
     Essex
    -0.82
     Blacks
    -0.80
     Charleston
    -0.80
     Windsor
    -0.79
    Positive Logits
    ı
    2.18
    ğ
    1.96
    ş
    1.92
    oğ
    1.56
    stanbul
    1.54
     Ö
    1.49
    ç
    1.46
     Erd
    1.45
    oğan
    1.42
    lar
    1.39
    Act Density 0.111%

    No Known Activations