INDEX
Explanations
names and their corresponding sentiments in narrative contexts
followed by apostrophes or quotation marks
foreign names or words
Negative Logits
OGND            -1.08
nakalista       -0.91
$")             -0.79
AndroidJUnit    -0.79
BibitemShut     -0.74
مشين            -0.72
kasarigan       -0.72
")}             -0.70
__":            -0.70
)++;            -0.70
Positive Logits
’               0.68
'               0.64
who             0.64
had             0.62
and             0.55
was             0.54
himself         0.54
did             0.53
knew            0.53
,               0.53
Activations Density 0.450%