INDEX
Explanations
phrases related to justifications and conditions
Negative Logits
growing       -0.77
Increasing    -0.76
Growing       -0.75
increasing    -0.74
Changing      -0.74
changing      -0.73
Growing       -0.73
growing       -0.71
Increasing    -0.71
Changing      -0.70
Positive Logits
saying        1.66
stating       1.47
noting        1.43
suggesting    1.37
saying        1.35
mentioning    1.35
wondering     1.32
describing    1.31
asking        1.27
thinking      1.26
Activations Density 0.493%