INDEX
Explanations
references to social roles and interactions within communities
Follows "which", "it", "that", or "also"
Negative Logits
Adds      -1.04
Removes   -0.96
Removes   -0.95
Sends     -0.94
Adds      -0.94
Goes      -0.93
Takes     -0.92
Gives     -0.92
Brings    -0.91
Theſe     -0.91
Positive Logits
also        0.47
brigens     0.46
end         0.45
resultCode  0.45
dis         0.44
lords       0.44
antaranya   0.43
directly    0.43
letzt       0.43
properly    0.43
Activations Density: 1.026%