INDEX
Explanations
adjectives related to invisibility or hiddenness
references to the concept of invisibility
Negative Logits
ortment    -0.81
xit        -0.76
ership     -0.75
ourses     -0.75
owder      -0.74
olitan     -0.74
inion      -0.73
nia        -0.73
andals     -0.73
aeper      -0.73
Positive Logits
invisible     0.97
invis         0.93
volcano       0.85
Invisible     0.78
unseen        0.78
phantom       0.75
disco         0.74
invincible    0.72
worm          0.72
disappear     0.72
Activations Density: 0.012%