INDEX
Explanations
repeated instances of the word "Hop" or its variations
Negative Logits
Token      Logit
divider    -0.18
hand       -0.17
aments     -0.17
tte        -0.17
ature      -0.16
aphore     -0.15
icator     -0.15
nder       -0.15
inner      -0.15
callee     -0.15
Positive Logits
Token       Logit
kins        0.24
py          0.24
itals       0.22
croft       0.21
portunity   0.20
kinson      0.19
ital        0.18
Hop         0.18
Hop         0.18
inion       0.17
Activations Density: 0.010%