INDEX
Explanations
gerunds and past participles related to actions
Head Attr Weights
0:0.02
1:0.01
2:0.07
3:0.05
4:0.13
5:0.02
6:0.03
7:0.35
8:0.02
9:0.04
10:0.16
11:0.04
Negative Logits
requ     -1.84
uctions  -1.67
ulla     -1.66
ylum     -1.66
iques    -1.63
pkg      -1.62
aceae    -1.58
rolet    -1.54
lov      -1.53
ones     -1.51
Positive Logits
gloom          2.05
daylight       1.85
fireball       1.76
Divinity       1.73
magnification  1.70
cursor         1.70
glow           1.69
accurately     1.67
mir            1.66
illusion       1.65
Activations Density 0.001%