INDEX
Explanations
telling stories and anecdotes
Negative Logits
oscillating    0.43
adjustable     0.38
المصطلح    0.38
argued         0.37
oscillations   0.37
inspected      0.36
parameters     0.35
സമ    0.35
অগ্রসর    0.34
utili          0.34
Positive Logits
stories      2.48
stories      2.22
Stories      2.19
Stories      2.14
story        2.11
racconto     2.09
STORIES      2.09
anecdotes    2.03
故事    2.02
storie       2.02
Activations Density 0.040%