Explanations
instances of reported speech or quotations from individuals
Negative Logits
Token       Logit
ãĥªãĥ¼      -0.15
agem        -0.15
stan        -0.14
ARIANT      -0.14
ston        -0.14
RAT         -0.13
Ĥ¨          -0.13
åĪ          -0.13
and         -0.13
zw          -0.13
Positive Logits
Token       Logit
quoting     0.18
chine       0.16
ocl         0.16
chaft       0.15
-quote      0.15
ormal       0.14
SetBranch   0.14
jspx        0.14
cairo       0.14
лоÑĩ        0.14
Activations Density 0.056%