Explanations
references to social media professionals and their roles
Negative Logits
Token       Logit
Wife        -0.17
Families    -0.15
妻          -0.15
wife        -0.15
wife        -0.15
raud        -0.14
itizen      -0.14
ÑĦоÑĢ       -0.14
achment     -0.13
idon        -0.13
Positive Logits
Token       Logit
Gran        0.17
Bip         0.16
halluc      0.16
suicidal    0.16
Pop         0.15
bipolar     0.15
something   0.15
Gran        0.15
text        0.15
herself     0.15
Activations Density 0.009%