Explanations
occurrences of the word "wh"
Negative Logits
Token       Logit
Blazers     -0.70
Duo         -0.68
BMC         -0.66
Sno         -0.65
mosaic      -0.63
WARE        -0.63
Grande      -0.61
liability   -0.61
styled      -0.61
clearance   -0.61
Positive Logits
Token       Logit
ilst        1.49
istle       1.39
olly        1.22
olen        1.17
soever      1.16
ats         1.15
ipl         1.14
ole         1.14
ichever     1.14
irling      1.13
Activations Density: 0.004%
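
The page does not define how the density figure is computed. A minimal sketch, assuming it is the fraction of sampled tokens on which the feature has a nonzero activation, might look like the following. The function name `activation_density`, the `threshold` parameter, and the sample data are all hypothetical and chosen only for illustration.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of tokens whose activation exceeds `threshold`.

    Assumption: "density" means the share of tokens on which the feature
    fires at all; the source page does not state its exact definition.
    """
    if not activations:
        raise ValueError("need at least one activation value")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical data: 100,000 tokens, of which 4 activate the feature.
acts = [0.0] * 100_000
for i in (10, 2_000, 50_000, 99_999):
    acts[i] = 1.5

density = activation_density(acts)
print(f"{density:.3%}")  # prints 0.004%
```

Under this reading, a density of 0.004% corresponds to roughly 4 active tokens per 100,000 sampled, which would fit a feature that fires only on one specific token fragment.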