Explanations
phrases and words associated with tearing or breaking apart
Negative Logits
ials    -0.17
ンパ    -0.17
McGr    -0.15
eza     -0.14
_PTR    -0.14
伏      -0.14
DG      -0.14
hat     -0.14
ogy     -0.13
clar    -0.13
Positive Logits
apart    0.46
Apart    0.35
Apart    0.28
Tear     0.28
torn     0.28
tear     0.28
tearing  0.27
tore     0.26
-ap      0.23
ripped   0.22
Activations Density: 0.017%