Explanations
mentions of potential or the possibility of future events
Negative Logits
ulu       -0.88
cipline   -0.82
antry     -0.80
zee       -0.78
lang      -0.78
thouse    -0.76
mson      -0.76
rea       -0.76
ĸļ        -0.74
toe       -0.74
Positive Logits
future          1.00
successors      0.93
obstruction     0.92
threats         0.86
pitfalls        0.85
culprit         0.84
replacements    0.83
conflicts       0.83
sequel          0.82
unintended      0.81
Activations Density: 0.051%