INDEX
Explanations
phrases related to actions or events being performed on something
verbs indicating actions related to societal issues or systemic failures
Negative Logits
Founding    -0.69
assies      -0.62
descended   -0.60
ierre       -0.59
Falling     -0.58
CLS         -0.58
Anders      -0.56
Shining     -0.55
otaur       -0.55
reunited    -0.55
Positive Logits
by            0.95
elsewhere     0.81
ynamic        0.79
.</           0.75
igated        0.75
prominently   0.75
unintention   0.74
through       0.74
inconsist     0.73
.–            0.73
Activations Density 0.335%