Explanations
conjunctions, particularly the word "And."
Head Attr Weights
0:0.17
1:0.13
2:0.06
3:0.04
4:0.04
5:0.07
6:0.04
7:0.02
8:0.17
9:0.08
10:0.05
11:0.07
Negative Logits
saf: -1.62
FIX: -1.47
Rh: -1.40
icent: -1.33
Boss: -1.33
ventures: -1.32
ocial: -1.30
Luxem: -1.29
Track: -1.28
eport: -1.27
Positive Logits
iqueness: 1.41
someday: 1.33
ersion: 1.32
dunno: 1.32
Hebdo: 1.27
nobody: 1.26
rationality: 1.26
Allah: 1.23
Divinity: 1.20
displ: 1.18
Activations Density 0.034%