INDEX
Explanations
phrases starting with "And."
occurrences of the word "And"
Negative Logits
manship    -0.68
heads      -0.68
scene      -0.66
atical     -0.65
houses     -0.65
fell       -0.63
cloth      -0.62
=\"        -0.61
ses        -0.61
amer       -0.61
Positive Logits
romeda       1.27
rea          1.16
hra          1.15
alus         0.96
secondly     0.94
rew          0.90
furthermore  0.88
then         0.87
rogens       0.86
ersen        0.86
Activations Density: 0.088%