INDEX
Explanations
question words in various contexts
the repeated use of the 'WH' question words, indicating inquiries or prompts related to "who," "what," "where," "when," and "why."
Negative Logits
bart        -0.66
Grande      -0.63
Sunshine    -0.63
uating      -0.62
uated       -0.62
Defenders   -0.60
clearance   -0.60
advance     -0.60
Bucks       -0.59
Replay      -0.59
Positive Logits
soever      1.24
ilst        1.20
ispers      1.09
olly        1.04
istle       1.00
ipl         0.95
irlwind     0.94
itness      0.93
ither       0.90
oles        0.90
Activations Density: 0.009%