Explanations
phrases that involve presenting options or outcomes
phrases indicating a dichotomy or multiple categories
Negative Logits
ergy         -0.67
Beast        -0.66
Ire          -0.66
Andromeda    -0.65
uddin        -0.61
olini        -0.60
une          -0.60
rattled      -0.60
aukee        -0.60
DERR         -0.57
Positive Logits
:-               1.10
%:               1.05
viz              1.02
:                0.92
simultaneously   0.91
):               0.90
:(               0.88
':               0.87
:#               0.84
depending        0.82
Activations Density 0.152%