INDEX
Explanations
expressions related to visual perception
instances of perception verbs related to seeing or observing
Negative Logits
hero      -0.69
Chev      -0.61
Miko      -0.60
Dism      -0.60
aston     -0.59
reply     -0.59
gart      -0.59
agraph    -0.58
flood     -0.58
resur     -0.57
Positive Logits
ãĤ´        0.75
entails    0.73
Learned    0.67
Ca         0.67
unfolding  0.66
ALD        0.65
UFC        0.64
ulse       0.64
atel       0.64
ulin       0.64
Activations Density: 0.240%