INDEX
Explanations
explaining complex or crucial details
Negative Logits
Both        0.92
Both        0.83
both        0.79
sự          0.70
både        0.69
മാക         0.67
です        0.67
Stiffness   0.66
Being       0.65
త్వం        0.65
Positive Logits
sometimes      1.07
often          1.01
yet            0.99
oftentimes     0.96
somewhat       0.94
possibly       0.94
occasionally   0.89
perhaps        0.89
arguably       0.88
informative    0.88
Activations Density 1.064%
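
The density figure above is presumably the share of sampled token positions on which this feature has a nonzero activation. The page does not state how it is computed, so the following is only a minimal sketch under that assumption. The function name `activation_density`, the `threshold` parameter, and the toy data are hypothetical and are not taken from the dashboard.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of positions whose activation exceeds `threshold`.

    Assumption: "density" means the share of token positions on which the
    feature fires at all; the source page does not define the term.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    # Count positions where the feature is active.
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical toy data: 3 of the 8 positions are active.
sample = [0.0, 0.0, 1.3, 0.0, 0.4, 0.0, 0.0, 2.1]
density = activation_density(sample)
print(f"{density:.3%}")
```

Under this reading, a density of 1.064% would mean the feature is active on roughly one in every 94 sampled positions.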