INDEX
Explanations
abstract or technical terms and specialized vocabulary
Head Attr Weights
0:0.03
1:0.03
2:0.04
3:0.35
4:0.02
5:0.02
6:0.05
7:0.13
8:0.05
9:0.06
10:0.08
11:0.09
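
The head attribution weights above can be read as a per-head share of the feature's attribution. The following is a minimal sketch, assuming each line has the form `head:weight` as listed; the helper name `parse_head_weights` is hypothetical and not part of any dashboard tooling. It parses the listing, picks the head with the largest weight, and sums the weights to show how far the rounded values are from 1.

```python
# Head attribution weights copied verbatim from the listing (head index:weight).
RAW_WEIGHTS = """0:0.03
1:0.03
2:0.04
3:0.35
4:0.02
5:0.02
6:0.05
7:0.13
8:0.05
9:0.06
10:0.08
11:0.09"""


def parse_head_weights(text):
    """Parse 'head:weight' lines into a dict {head_index: weight}.

    Hypothetical helper; assumes one 'int:float' pair per non-empty line.
    """
    weights = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        head, value = line.split(":")
        weights[int(head)] = float(value)
    return weights


weights = parse_head_weights(RAW_WEIGHTS)

# Head with the largest attribution weight.
top_head = max(weights, key=weights.get)

# Sum of the displayed (rounded) weights; it need not equal exactly 1.
total = sum(weights.values())

print(f"top head: {top_head} (weight {weights[top_head]:.2f})")
print(f"sum of listed weights: {total:.2f}")
```

With the values listed here, head 3 carries the largest share (0.35), followed by head 7 (0.13); the displayed weights sum to 0.95, and the listing does not say whether the remainder is due to rounding.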
Negative Logits
rities       -1.27
ufact        -1.19
rences       -1.13
stocks       -1.10
letter       -1.09
child        -1.08
�            -1.08
bart         -1.05
anne         -1.03
advertising  -1.01
Positive Logits
ippi         1.48
vous         1.32
emia         1.18
pter         1.12
ldon         1.03
agog         1.03
indo         1.03
OSH          1.00
glutamate    0.99
rha          0.97
Activations Density 0.003%