INDEX
Explanations
Instances of the word "say", its variants, and related expressions that indicate speech or assertion.
Negative Logits
ikk       -0.73
summary   -0.72
appers    -0.70
iets      -0.66
oggle     -0.66
1945      -0.65
rows      -0.65
uga       -0.64
onds      -0.63
obar      -0.63
Positive Logits
coinc     0.68
...?      0.59
therein   0.58
)?        0.58
...)      0.57
ably      0.57
inc       0.56
eth       0.56
â̦)       0.56
Ħ¢        0.56
Activation Density: 0.025%