Explanations
mentions of actions being taken or decisions being made by someone
narratives about challenges and resolutions in storytelling
Negative Logits
³³³³³³³³³³³³³³³³    -0.71
>>>>>>>>            -0.70
âĢ                  -0.69
His                 -0.68
ã                   -0.68
´                   -0.68
âĢ                  -0.68
izons               -0.66
)</                 -0.66
ende                -0.66
Positive Logits
themselves          1.29
their               1.19
theirs              1.13
their               1.04
THEIR               1.00
they                0.92
Their               0.77
Their               0.75
they                0.72
apiece              0.71
Activations Density: 0.832%