Explanations
words related to physical objects and actions
specific nouns and terms related to various subjects, especially in science, film, and cooking
Negative Logits
referen       -0.89
vous          -0.77
arlane        -0.70
Masquerade    -0.66
ij士          -0.65
notor         -0.64
ciating       -0.63
proport       -0.62
redes         -0.61
اÙĦ           -0.60
Positive Logits
imus          0.79
auts          0.70
rees          0.66
antis         0.66
forest        0.63
nets          0.62
bush          0.61
lines         0.61
rings         0.60
ombies        0.59
Activations Density 0.465%