Explanations
phrases indicating the need for a change or action to be taken
Negative Logits
hab         -0.73
BAT         -0.69
sole        -0.67
pet         -0.64
efe         -0.62
md          -0.59
eni         -0.58
currency    -0.57
chio        -0.57
hyde        -0.57
Positive Logits
surely             1.08
congratulations    0.98
yeah               0.97
why                0.89
yes                0.84
shouldn            0.82
we                 0.78
chances            0.77
logically          0.76
they               0.76
Activations Density: 0.059%