Explanations
adjectives indicating size or scale
phrases referring to significant quantities or sizes
Negative Logits
ysc          -0.74
uay          -0.74
Origins      -0.72
rick         -0.72
flies        -0.67
np           -0.67
Scully       -0.64
Pigs         -0.64
Annotations  -0.63
Blocks       -0.63
Positive Logits
thing        1.18
scenario     0.96
delicate     0.93
sensitive    0.92
huge         0.90
feat         0.89
drastic      0.88
situation    0.88
diverse      0.83
complicated  0.83
Activations Density 0.031%