INDEX
Explanations
describing humans or personal characteristics
Negative Logits
几何  0.35
可视化  0.33
物理  0.32
системы  0.32
ToolStrip  0.31
инфраструк  0.31
散热  0.30
cinematographer  0.30
ୋ  0.30
উপকূল  0.30
Positive Logits
Biographical  0.37
인간  0.37
humain  0.37
humanas  0.36
personal  0.36
humana  0.35
mammalian  0.35
Personnel  0.33
Personal  0.33
نمی  0.32
Activations Density: 0.499%