INDEX
Explanations
the word "husband"
the term "husband" in various contexts
Negative Logits
Token      Logit
JPM        -0.77
McC        -0.74
ourcing    -0.74
spir       -0.74
ortmund    -0.73
EVA        -0.71
Ukrain     -0.69
Races      -0.68
uum        -0.67
Fever      -0.67
Positive Logits
Token      Logit
pins       0.84
pin        0.81
hood       0.76
ton        0.72
husband    0.69
Philip     0.68
loo        0.68
ndra       0.68
friend     0.68
dad        0.66
Activations Density: 0.027%