INDEX
Explanations
words associated with justifying or providing reasons for actions or decisions
terms related to justification and rationale
Negative Logits
berry        -0.78
grass        -0.76
ummer        -0.75
thumbnails   -0.74
semble       -0.72
along        -0.71
estone       -0.70
chron        -0.70
izen         -0.69
berries      -0.68
Positive Logits
justifying      0.93
why             0.85
spending        0.77
inaction        0.77
justification   0.75
attribut        0.74
banning         0.71
justify         0.71
ably            0.70
WHY             0.70
Activations Density: 0.040%