INDEX
Explanations
Turkish words and phrases
fragments of words or syllables, likely indicating a focus on language patterns and morphology
Negative Logits
Rhodes        -1.06
Wilmington    -0.88
Rouge         -0.86
Cruise        -0.84
Belle         -0.83
Scorp         -0.83
Essex         -0.82
Blacks        -0.80
Charleston    -0.80
Windsor       -0.79
Positive Logits
ı                    2.18
ğ (raw: ÄŁ)          1.96
ş (raw: ÅŁ)          1.92
oğ (raw: oÄŁ)        1.56
stanbul              1.54
Ö (raw: Ãĸ)          1.49
ç                    1.46
Erd                  1.45
oğan (raw: oÄŁan)    1.42
lar                  1.39
Activations Density 0.111%