INDEX
Explanations
mentions of reading or recommendations to read
instances of the word "read."
Negative Logits
pload      -0.82
ascal      -0.79
xon        -0.74
ortality   -0.74
ño         -0.71
unity      -0.71
ugal       -0.70
afort      -0.69
Truth      -0.68
VC         -0.68
Positive Logits
aloud           1.25
just            1.03
comprehension   0.96
ied             0.85
ying            0.84
dress           0.81
read            0.78
ahead           0.76
reads           0.74
iness           0.73
Activations Density: 0.030%