INDEX
Explanations
references to physical size or intensity
instances of the word "giant" in various contexts
Negative Logits
yrinth      -0.92
iggins      -0.83
endment     -0.82
qi          -0.77
blance      -0.76
rences      -0.76
nc          -0.75
iring       -0.75
earchers    -0.74
tein        -0.73
Positive Logits
squid       0.93
chunk       0.81
gest        0.78
monster     0.78
conglomer   0.77
gorilla     0.77
leap        0.75
Slayer      0.72
eyeb        0.72
titan       0.71
Activations Density: 0.017%