INDEX
Explanations
question words like "What" or "Who"
occurrences of the word "Wh"
Negative Logits
Grande      -0.67
WARE        -0.67
Duo         -0.65
Awakening   -0.61
ULAR        -0.61
Blazers     -0.61
Strauss     -0.61
Letter      -0.59
Barton      -0.59
sten        -0.57
Positive Logits
istle       1.43
ilst        1.35
irlwind     1.26
olly        1.21
ispers      1.20
soever      1.17
isky        1.16
olen        1.11
irling      1.10
ichever     1.08
Activations Density 0.023%
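
The page does not define how the density figure is computed. A minimal sketch, assuming it is the fraction of sampled tokens on which the feature's activation is above zero: the function name `activation_density`, the `threshold` parameter, and the sample counts are hypothetical and chosen only to illustrate the arithmetic behind a value like 0.023%.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of tokens whose activation exceeds `threshold`.

    Assumption: "density" means the share of sampled tokens on which the
    feature fires at all. The source page does not state its definition.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: the feature fires on 23 of 100,000 tokens.
sample = [1.0] * 23 + [0.0] * 99_977
density = activation_density(sample)
print(f"{density:.3%}")  # prints 0.023%
```

Under this reading, a density of 0.023% corresponds to roughly 23 active tokens per 100,000 sampled.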