INDEX
Explanations
personal pronouns followed by verbs
the word "You."
Negative Logits
Token        Logit
srfAttach    -0.64
Lago         -0.62
shore        -0.60
ammon        -0.59
ensable      -0.58
stemming     -0.58
ice          -0.58
icy          -0.57
actic        -0.57
ipal         -0.57
Positive Logits
Token        Logit
're          1.38
've          1.21
'll          1.19
guys         1.06
'd           0.97
tub          0.96
ths          0.88
ldon         0.87
ngth         0.85
know         0.84
Activations Density: 0.104%