Explanations
occurrences of the word "it."
Negative Logits
ishly       -0.19
initely     -0.16
hand        -0.16
onna        -0.15
sylvania    -0.15
rig         -0.15
morgan      -0.14
\common     -0.14
edList      -0.14
-eslint     -0.14
Positive Logits
iner        0.34
/her        0.27
chy         0.26
zelf        0.26
inerary     0.26
/th         0.25
unes        0.23
self        0.23
anium       0.20
ches        0.20
Activations Density: 0.176%