INDEX
Explanations
information in longer texts, mentioning various individuals, places, and activities with a focus on academic, legal, and technical contexts
phrases indicating positions or titles of academia or authority
Head Attr Weights
0:0.04
1:0.02
2:0.18
3:0.05
4:0.29
5:0.04
6:0.02
7:0.02
8:0.10
9:0.11
10:0.04
11:0.02
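The head attribution weights listed above can be summarized with a few lines of Python. This is a minimal sketch, not part of the original page: the name `head_attr_weights` and the helper `top_heads` are hypothetical, and the values are copied by hand from the listing. It ranks the heads by weight and reports how much of the listed total the top three account for.

```python
# Head attribution weights copied from the listing above (head index -> weight).
# The dictionary name and the helper below are illustrative assumptions.
head_attr_weights = {
    0: 0.04, 1: 0.02, 2: 0.18, 3: 0.05, 4: 0.29, 5: 0.04,
    6: 0.02, 7: 0.02, 8: 0.10, 9: 0.11, 10: 0.04, 11: 0.02,
}


def top_heads(weights, k=3):
    """Return the k (head, weight) pairs with the largest weight, descending."""
    return sorted(weights.items(), key=lambda kv: kv[1], reverse=True)[:k]


# Sum of all listed weights, rounded to the precision of the listing.
total = round(sum(head_attr_weights.values()), 2)

# The three heads with the largest weights and their combined weight.
top3 = top_heads(head_attr_weights)
top3_share = round(sum(w for _, w in top3), 2)

print("total:", total)
print("top 3 heads:", top3)
print("combined weight of top 3:", top3_share)
```

With the values as listed, heads 4, 2, and 9 carry the largest weights, and the twelve weights sum to 0.93 rather than 1, which may simply reflect rounding in the displayed values.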
Negative Logits
recapt     -1.41
curtains   -1.38
ptoms      -1.33
oğ         -1.31
dots       -1.29
uctions    -1.28
rentals    -1.27
��         -1.25
anqu       -1.25
roundup    -1.24
Positive Logits
cius          1.47
Weather       1.47
staff         1.32
hematically   1.32
IEEE          1.29
Marx          1.24
Gutenberg     1.24
humanities    1.23
Degree        1.22
student       1.22
Activations Density 0.019%