INDEX
Explanations
core idea, crucial, effective
NEGATIVE LOGITS
feme        0.75
社員        0.74
persoane    0.74
офі         0.72
………..       0.72
Woman       0.71
महिला       0.70
osoby       0.70
Ulster      0.69
ренные      0.69
POSITIVE LOGITS
intuition          1.56
heuristics         1.41
intuitive          1.40
intuitively        1.39
heuristic          1.29
computationally    1.18
empirically        1.17
computational      1.15
intuit             1.15
intu               1.14
Activations Density 1.900%