INDEX
Explanations
phrases related to causal relationships and implications based on varying conditions
phrases related to environmental impacts and their consequences
Negative Logits
Phill          -0.71
afety          -0.68
usp            -0.68
Elijah         -0.65
stros          -0.63
unlaw          -0.62
akespe         -0.61
revocation     -0.61
disse          -0.60
particulars    -0.60
Positive Logits
than            1.29
healthier       1.08
fewer           1.02
richer          1.00
higher          1.00
decreased       1.00
quicker         0.99
faster          0.98
stronger        0.97
Decre           0.96
Activations Density: 1.101%