INDEX
Explanations
words related to knowledge or understanding
the conjunctions "and" and "or."
Negative Logits
Squid       -0.67
heit        -0.65
MET         -0.63
steroids    -0.60
Jackets     -0.59
nc          -0.58
Rainbow     -0.58
Ens         -0.58
Copper      -0.57
Arms        -0.56
Positive Logits
rogen       0.84
romeda      0.80
rogens      0.79
analyse     0.73
diagnose    0.73
smelled     0.71
hear        0.70
ryn         0.70
dden        0.70
sound       0.70
Activations Density 0.175%
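
The density figure can be read as the share of token positions on which this feature fires. The page does not state the exact definition, so the sketch below assumes "fires" means an activation strictly greater than zero; the function name `activation_density`, the `threshold` parameter, and the sample data are all hypothetical and chosen only to illustrate the arithmetic.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds `threshold`.

    Assumption: a position counts as active when its activation is strictly
    greater than `threshold` (0.0 by default). The dashboard may use a
    different cutoff.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical data: 4000 token positions, 7 of which activate the feature.
sample = [0.0] * 3993 + [1.2, 0.4, 2.1, 0.9, 0.3, 1.7, 0.6]
density = activation_density(sample)
print(f"{density:.3%}")
```

With 7 active positions out of 4000, the ratio is 0.00175, which matches the 0.175% scale reported above: the feature is sparse and fires on fewer than two tokens in a thousand.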