Explanations
words and phrases related to disconnection or detachment
terms related to social connections and disconnections
Negative Logits
gg            -0.71
gio           -0.70
GB            -0.67
Rub           -0.66
OPER          -0.66
TRY           -0.63
esan          -0.62
DOM           -0.61
ggle          -0.60
enforce       -0.60
Positive Logits
disconnect     1.04
icut           0.88
owship         0.85
connection     0.78
nect           0.75
disconnected   0.75
racted         0.74
aline          0.73
yip            0.72
chart          0.72
Activations Density: 0.013%