Explanations
words related to strong interests or fixations
terms related to intense fixation or preoccupation
Negative Logits
Token          Logit
ģ«             -0.67
Equality       -0.65
Medic          -0.65
stood          -0.64
stand          -0.62
neau           -0.60
degradation    -0.60
gs             -0.60
izontal        -0.60
GG             -0.59
Positive Logits
Token          Logit
ishly          0.89
fascination    0.87
iously         0.80
obs            0.80
fascinated     0.78
MJ             0.75
meric          0.73
obsession      0.72
atically       0.72
obsess         0.72
Activations Density: 0.070%