INDEX
Explanations
words related to action verbs and situational adjectives
phrases related to negative experiences or emotions
Negative Logits
inspecting      -0.67
endeav          -0.63
retaining       -0.61
congratulated   -0.61
lication        -0.61
ledge           -0.61
stances         -0.60
Parables        -0.60
positioning     -0.59
browsing        -0.59
Positive Logits
fruition             0.78
happening            0.78
guiActiveUnfocused   0.75
downhill             0.72
netflix              0.71
undone               0.70
ripple               0.70
ipples               0.69
outweigh             0.66
occurring            0.64
Activations Density: 0.702%