Explanations
also; additionally; moreover
Negative Logits
↵↵↵              1.01
↵↵↵↵↵↵↵↵↵        0.96
↵↵↵↵             0.95
$-$              0.95
↵↵↵↵↵↵↵          0.92
↵↵↵↵↵            0.89
↵↵↵↵↵↵           0.88
↵↵↵↵↵↵↵↵         0.86
↵↵↵↵↵↵↵↵↵↵↵↵↵↵↵  0.84
↵↵↵↵↵↵↵↵↵↵↵↵↵    0.83
Positive Logits
Also          0.75
Also          0.75
Moreover      0.72
Inoltre       0.72
Additionally  0.72
أيضا          0.70
Additionally  0.69
Furthermore   0.68
Moreover      0.68
Ayrıca        0.67
Activations Density 1.236%