INDEX
Explanations
phrases indicating a decision or judgment based on certain criteria
phrases that contain the words "based on"
Negative Logits
vous     -0.81
bats     -0.77
ura      -0.74
Sport    -0.71
Lab      -0.71
zon      -0.70
Forge    -0.69
flush    -0.68
Bit      -0.68
Daddy    -0.66
POSITIVE LOGITS
behalf          0.99
principles      0.98
assumption      0.93
assumptions     0.88
hears           0.87
principle       0.85
observation     0.84
sheer           0.84
conjecture      0.82
observations    0.80
Activations Density 0.095%