Explanations
pronoun followed by modal verb or conjunction
NEGATIVE LOGITS
Combat      0.83
combat      0.81
Combat      0.79
yesterday   0.78
weaponry    0.76
আজকে        0.75
ដូច្នេះ       0.74
nobility    0.74
tomorrow    0.72
warfare     0.72
POSITIVE LOGITS
selalu      1.09
sering      0.98
sempre      0.98
আরো         0.94
pernah      0.91
ยัง          0.90
активно     0.88
često       0.88
直接        0.88
逐渐        0.87
Activations Density 0.010%