INDEX
Explanations
instances of monetary values expressed in numbers
specific letters or characters appearing in the text
Negative Logits
Moff         -0.60
Gamble       -0.53
Osw          -0.52
Gadget       -0.50
AE           -0.50
HIP          -0.46
commenting   -0.45
advertising  -0.43
EVs          -0.43
BMC          -0.42
Positive Logits
arant        0.68
alin         0.67
otom         0.63
anche        0.61
iren         0.61
odox         0.61
omas         0.60
ioned        0.59
yth          0.59
roid         0.58
Activations Density 1.705%
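The page does not say how the density figure is computed. A common reading is the fraction of sampled tokens on which the feature fires at all, and the sketch below illustrates that reading only. The function name `activation_density`, the `threshold` parameter, and the toy data are hypothetical and do not come from this page.

```python
from typing import Sequence


def activation_density(activations: Sequence[float], threshold: float = 0.0) -> float:
    """Return the fraction of tokens whose activation exceeds `threshold`.

    Assumption: "density" means the share of tokens with a nonzero
    (here: strictly greater than threshold) feature activation.
    """
    if not activations:
        raise ValueError("need at least one activation value")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical toy sample: 1000 tokens, 17 of which activate the feature.
sample = [0.0] * 983 + [1.2] * 17
density = activation_density(sample)
print(f"{density:.3%}")
```

Under this reading, a density of 1.705% would mean the feature is active on roughly 17 of every 1000 tokens in the sample used to build the page.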