INDEX
Explanations
adjectives or verbs related to something being distorted or misshapen
words related to distortion or manipulation of concepts or objects
Negative Logits
alty        -0.85
ILA         -0.76
worthiness  -0.76
ciation     -0.75
ervation    -0.73
¯¯¯¯        -0.72
iphate      -0.72
particip    -0.71
upon        -0.70
Han         -0.69
Positive Logits
twisted     1.17
twist       1.12
twisting    1.06
twists      1.00
Twisted     0.86
lengths     0.81
intrins     0.78
ulously     0.78
adolesc     0.74
endish      0.74
Activations Density 0.006%