INDEX
Explanations
phrases related to physical positions and outcomes
Negative Logits
indication     -0.67
heed           -0.64
lain           -0.64
intention      -0.61
contingency    -0.61
ongyang        -0.61
assurance      -0.61
grass          -0.61
Awareness      -0.60
suggestion     -0.60
Positive Logits
èª             0.84
ãĥĩãĤ£         0.84
needing        0.81
deleting       0.78
liking         0.77
owning         0.77
accidentally   0.76
missing        0.76
encountering   0.75
trapped        0.74
Activations Density 0.693%