INDEX
Explanations
terms related to steepness, difficulty, or significant changes, often in a metaphorical context
Head Attr Weights
0:0.03
1:0.02
2:0.10
3:0.07
4:0.02
5:0.04
6:0.10
7:0.17
8:0.04
9:0.03
10:0.07
11:0.25
Negative Logits
�� -1.37
�� -1.35
�� -1.26
acea -1.24
ghazi -1.23
�� -1.23
onymous -1.11
pronouns -1.08
arten -1.05
Zeit -1.02
Positive Logits
reaching 1.16
(> 1.16
wow 1.08
iculty 1.07
gain 1.07
allows 1.06
considering 1.04
Availability 1.03
yielding 1.03
osures 1.01
Activations Density: 0.003%