Explanations
phrases related to the intention or purpose of an action towards a specific target
phrases indicating intent or purpose
Negative Logits
Token        Logit
IMAGES       -0.64
士           -0.63
anon         -0.62
Joined       -0.61
Situation    -0.60
Transcript   -0.59
etc          -0.59
Frie         -0.58
Operating    -0.57
apt          -0.54
Positive Logits
Token         Logit
squarely      1.21
specifically  0.91
at            0.90
towards       0.88
toward        0.86
aimed         0.86
primarily     0.84
solely        0.81
principally   0.80
mainly        0.78
Activations Density 0.056%