Explanations
seeking specific elaboration
Negative Logits
Token        Logit
often        0.95
keeps        0.91
oftentimes   0.88
doesn        0.87
also         0.85
sometimes    0.84
often        0.83
now          0.81
karena       0.80
omdat        0.80
Positive Logits
Token        Logit
Specific     1.25
Specific     1.19
подробно     1.16
വിശദ         1.14
Details      1.14
详细         1.12
Detailed     1.11
Details      1.11
How          1.09
Detail       1.08
Activations Density 0.113%