INDEX
Explanations
instances of attempted actions
nouns and phrases that indicate attempts, actions, or outcomes
NEGATIVE LOGITS
Fury            -0.71
Apocalypse      -0.67
Everest         -0.67
Helsinki        -0.67
Bark            -0.66
Sheridan        -0.64
Rogue           -0.64
ahime           -0.63
Slaughter       -0.63
Pixie           -0.61
POSITIVE LOGITS
including        0.93
besides          0.92
imaginable       0.91
simultaneously   0.85
redund           0.82
cies             0.78
prises           0.76
tags             0.76
kat              0.75
clerosis         0.74
Activations Density 0.393%