Explanations
politely stated phrases or requests
expressions of apology and request for politeness
Negative Logits
Liberty      -0.82
Caribbean    -0.75
Rockefeller  -0.73
NYC          -0.71
             -0.70
telecom      -0.69
retirees     -0.68
Berm         -0.68
centrally    -0.68
Tourism      -0.67
Positive Logits
âĢ            1.73
··            1.48
âĻ            1.27
?ãĢį          1.27
âĶĢ           1.22
âĶĢâĶĢ        1.22
âĢķ           1.22
.ãĢį          1.15
âĶĢâĶĢâĶĢâĶĢ  1.11
.</           1.10
Activations Density 0.640%