Explanations
mentions of specific names or titles, particularly related to individuals and locations
Emir, Amir, Prince, nationalities, places
Negative Logits
Token        Logit
afternoon    -0.41
HasBeen      -0.41
kasarigan    -0.41
Jpa          -0.40
spe          -0.39
Barnes       -0.39
Potato       -0.38
Barna        -0.38
cabe         -0.38
Nen          -0.37
Positive Logits
Token        Logit
Emir         1.84
emir         1.62
Emir         1.23
Amir         1.20
Amir         1.07
amir         0.85
امیر         0.70
herzog       0.68
emirates     0.67
الأمير       0.62
Activations Density: 0.001%