INDEX
Explanations
mentions of stalking behavior
terms related to stalking and creepy behavior
Negative Logits
Token        Logit
immersion    -0.74
ption        -0.72
gae          -0.68
ilts         -0.68
andum        -0.68
itars        -0.67
ptive        -0.67
iyah         -0.66
ederation    -0.65
Rite         -0.65
Positive Logits
Token        Logit
Creep        1.09
stalking     1.01
stalk        0.96
ingly        0.91
crow         0.90
ritch        0.81
lords        0.79
creep        0.79
Detect       0.77
harass       0.75
Activations Density: 0.037%