INDEX
Explanations
references to statements or reporting
phrases indicating statements or claims made by individuals
Negative Logits
Token            Logit
DragonMagazine   -0.69
cigarettes       -0.64
Canaver          -0.63
ATER             -0.61
ENS              -0.59
cedented         -0.58
hematically      -0.58
ukong            -0.58
estern           -0.58
urable           -0.58
Positive Logits
Token            Logit
.).              0.79
).[              0.68
).               0.68
).               0.57
.}               0.57
].               0.55
%).              0.55
.[               0.54
)).              0.54
)."              0.53
Activations Density: 2.512%