INDEX
Explanations
words related to relationships between people
words related to personal relationships and their dynamics
Negative Logits
icter     -0.77
isoft     -0.76
nin       -0.75
rikes     -0.72
ocument   -0.70
mart      -0.70
atel      -0.70
orah      -0.70
strip     -0.69
obi       -0.69
Positive Logits
regard     0.92
regards    0.88
respect    0.75
impunity   0.73
whom       0.72
fellow     0.70
Jane       0.70
stood      0.69
Jared      0.69
Samantha   0.68
Activations Density: 0.097%