INDEX
Explanations
linguistic features that indicate quotation marks or speech elements
Head Attr Weights
0:0.03
1:0.03
2:0.11
3:0.17
4:0.02
5:0.04
6:0.06
7:0.23
8:0.06
9:0.08
10:0.05
11:0.07
Negative Logits
\\\\\\\\      -1.46
Lash          -1.28
SourceFile    -1.23
َ             -1.16
theless       -1.14
imar          -1.13
jandro        -1.11
Salam         -1.11
Samp          -1.10
helm          -1.10
Positive Logits
anium         1.31
poons         1.18
ymm           1.17
usefulness    1.14
phies         1.12
constituents  1.11
vich          1.11
portfolio     1.07
inas          1.06
IPO           1.06
Activations Density 0.002%