INDEX
Explanations
phrases related to discussions and analyses of various societal and cultural issues
Negative Logits
.","
-0.54
..."
-0.53
�
-0.51
\"
-0.50
('
-0.47
\"
-0.42
``(
-0.41
(-
-0.41
...
-0.41
-->
-0.41
Positive Logits
resa
0.83
odore
0.81
xiety
0.67
Conclusion
0.65
withstanding
0.61
notations
0.61
uckland
0.55
foundland
0.54
chieve
0.54
ibliography
0.54
Activations Density 24.460%