INDEX
Explanations
direct references to or mentions of the word "it"
instances of the word "call" or its variations in various contexts
Negative Logits
ashington   -0.77
oglu        -0.70
idth        -0.69
bral        -0.66
icka        -0.66
iston       -0.64
achev       -0.63
wa          -0.62
ittal       -0.62
isphere     -0.61
Positive Logits
"#          0.78
qu          0.78
whatever    0.75
bluff       0.74
heresy      0.72
'           0.72
ãĢİ         0.70
anything    0.69
"           0.69
treason     0.65
Activations Density: 0.063%