INDEX
Explanations
phrases indicating a function or usefulness of an object or action
phrases indicating functionality or purpose
Negative Logits
inspected      -0.68
gins           -0.59
opher          -0.58
uesday         -0.58
Obj            -0.58
acqu           -0.57
ti             -0.57
pex            -0.57
ibliography    -0.56
ipl            -0.54
Positive Logits
purpose        0.72
umes           0.69
primarily      0.69
mainly         0.68
icer           0.68
as             0.68
purposes       0.67
ibaba          0.66
interests      0.66
agonists       0.66
Activations Density 0.032%