Explanations
phrases prompting the reader to check out something
calls to action encouraging the reader to check out content or links
Negative Logits
Token         Logit
suscept       -0.83
reluct        -0.68
lishes        -0.65
mathemat      -0.63
inval         -0.62
cffffcc       -0.61
onte          -0.59
EStreamFrame  -0.59
OF            -0.59
ody           -0.58

Positive Logits
Token         Logit
mate           1.27
lists          1.01
out            1.00
boxes          0.97
list           0.92
out            0.92
points         0.87
mates          0.87
outs           0.85
point          0.78

Activations Density 0.017%