INDEX
Explanations
direct references to the reader using the word "You"
instances of the word "You."
Negative Logits
wrapper      -0.63
itud         -0.62
theirs       -0.60
airs         -0.60
temp         -0.58
shore        -0.57
majority     -0.56
srfAttach    -0.55
ãĥ³ãĤ¸       -0.55
stemming     -0.55
Positive Logits
're          1.15
've          1.08
'll          1.02
Gov          1.00
guessed      0.99
ngth         0.94
Tube         0.94
imar         0.91
guys         0.91
ths          0.90
Activations Density 0.108%