INDEX
Explanations
descriptive phrases related to architectural structures
Negative Logits
tsky       -0.73
å§«        -0.73
ppo        -0.72
Ĥİ         -0.66
zzo        -0.65
FIRE       -0.65
auga       -0.64
Blaze      -0.64
ongyang    -0.63
llan       -0.63
Positive Logits
etermined  1.15
oubt       1.13
aunted     1.12
iscovered  1.09
oubtedly   1.07
irect      1.07
efined     1.03
epend      1.02
ploy       0.99
icated     0.99
Activations Density: 0.046%