INDEX
Explanations
words starting with 'sw'
instances of the term "sw" in various contexts
Negative Logits
marginal      -0.64
Rubin         -0.64
CST           -0.62
percentile    -0.60
frust         -0.59
Gilbert       -0.59
ilon          -0.59
Dome          -0.58
Burnett       -0.57
relig         -0.57
Positive Logits
sw            4.40
SW            2.01
Sw            1.97
swe           1.56
Sw            1.55
sword         1.48
sm            1.37
swer          1.37
sp            1.37
sw            1.35
Activations Density: 0.003%