Explanations
No Explanations Found
Negative Logits
Token         Logit
daycare       1.46
housework     1.39
patriotism    1.35
mensen        1.32
возле         1.31
žena          1.30
restoran      1.29
adoration     1.29
wages         1.28
homestead     1.28
Positive Logits
Token         Logit
F             1.21
R             1.17
(             1.16
PhysRev       1.16
B             1.16
V             1.15
(             1.14
outperforms   1.09
optimized     1.08
optimized     1.08
Activations Density: 0.828%