Explanations
text that highlights personal relationships and social interactions
Negative Logits
isu                 -0.17
"(\<                -0.17
ãģĵãĤĵãģ«ãģ¡ãģ¯      -0.15
سÙģ                -0.15
ichert              -0.15
STYPE               -0.14
ivet                -0.14
arging              -0.14
serter              -0.14
覧                  -0.14
Positive Logits
Dr                  0.38
Professor           0.32
Dr                  0.31
Mr                  0.31
Mr                  0.27
Professor           0.26
Prof                0.25
Senator             0.25
Judge               0.24
Ms                  0.24
Activations Density 0.350%
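
The density figure of 0.350% can be read as the share of token positions on which the feature fires. The sketch below illustrates that reading under a stated assumption: that "active" means an activation strictly above zero. The function name `activation_density`, the `threshold` parameter, and the sample data are hypothetical and are not taken from the dashboard.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds threshold.

    Assumption: a position counts as active when its activation is
    strictly greater than `threshold` (0.0 by default).
    """
    if not activations:
        raise ValueError("need at least one activation value")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: 2000 token positions, 7 of which fire.
sample = [0.0] * 1993 + [0.8, 1.2, 0.3, 2.1, 0.5, 0.9, 1.7]
density = activation_density(sample)
print(f"{density:.3%}")
```

With this made-up sample, 7 active positions out of 2000 give a density on the same scale as the one reported above; the real figure would depend on the dataset and threshold the dashboard actually used.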