INDEX
Explanations
words related to information, investigation, or evidence
verbs related to communication, responses, and offerings
Negative Logits
LU          -0.67
ugi         -0.64
sometimes   -0.62
rather      -0.62
loo         -0.62
probably    -0.62
unknown     -0.62
loving      -0.61
not         -0.60
beware      -0.60
Positive Logits
any         1.95
anything    1.81
ANY         1.47
anywhere    1.41
nor         1.41
anyone      1.38
anybody     1.38
anymore     1.38
anything    1.26
slightest   1.22
Activations Density: 0.436%