Explanations
mentions of close relationships and intimacy
Negative Logits
orsch       -0.16
jak         -0.16
Aws         -0.15
èµ·ãģĵ      -0.15
ello        -0.15
atee        -0.15
IFIC        -0.15
entric      -0.15
Aviv        -0.15
üstü        -0.14
Positive Logits
proximity    0.27
knit         0.23
/close       0.23
-caption     0.22
clos         0.21
(close       0.21
close        0.20
close        0.20
-quarters    0.19
Close        0.19
Activations Density: 0.026%