INDEX
Explanations
mentions of historical, religious, and technical terms
Negative Logits
Token      Logit
Catalog    -0.78
istg       -0.74
igue       -0.71
ername     -0.70
/"         -0.69
ucci       -0.67
ashtra     -0.67
.''.       -0.66
.-         -0.66
way        -0.66
Positive Logits
Token      Logit
these      0.94
each       0.83
those      0.82
mankind    0.82
humankind  0.79
the        0.79
our        0.78
oneself    0.73
this       0.72
incumb     0.69
Activations Density 3.952%
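
The density figure is commonly read as the fraction of token positions on which the feature is active. The page does not say how it was computed, so the sketch below is only an assumption: it treats any activation above a threshold of 0 as "active", and the function name and sample values are hypothetical.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of positions whose activation exceeds `threshold`.

    Assumption: "active" means strictly greater than the threshold (0 by default).
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical activations for eight token positions (not from the page).
sample = [0.0, 0.0, 1.3, 0.0, 0.2, 0.0, 0.0, 0.0]
print(f"{activation_density(sample):.3%}")
```

Under this reading, a density of 3.952% would mean the feature fires on roughly 4 of every 100 tokens in whatever sample was used.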