INDEX
Explanations
words related to personal relationships or individuals
references to personal relationships or social dynamics
Negative Logits
SIM       -0.62
UCT       -0.61
sequ      -0.60
VIDEOS    -0.58
Module    -0.57
DX        -0.57
Integ     -0.56
rogens    -0.56
ilion     -0.55
ILLE      -0.54
Positive Logits
who       1.85
whom      1.85
who       1.79
whose     1.44
whose     1.42
Who       1.08
Who       1.04
fame      0.97
agher     0.94
wishing   0.87
Activations Density 0.900%