INDEX
Explanations
instances of the word "cut," indicating a focus on cutting or related actions
Negative Logits
enta     -0.16
oles     -0.16
cko      -0.15
oi       -0.15
eca      -0.15
ity      -0.15
istor    -0.14
Hanna    -0.14
annes    -0.14
ouns     -0.13
Positive Logits
aneous    0.25
rippling  0.20
          0.19
ãĤ¥       0.19
cut       0.18
ting      0.17
rell      0.17
-cut      0.17
=cut      0.17
throat    0.17
Activations Density 0.022%
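
To make the density figure concrete, here is a minimal sketch that assumes "activations density" means the fraction of token positions on which the feature fires (activation above zero). The function name, the threshold, and the sample data are hypothetical and do not come from the dashboard itself.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds `threshold`.

    Assumption: density is defined as (active positions) / (all positions).
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: 100,000 token positions, 22 of which activate the feature.
sample = [0.0] * 99_978 + [1.5] * 22
density = activation_density(sample)
print(f"{density:.3%}")  # prints 0.022%
```

Under that assumption, a density of 0.022% corresponds to roughly 22 active positions per 100,000 tokens, which fits a sparse feature tied to a single word such as "cut".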