INDEX
Explanations
phrases containing the word "call" with high activation values
instances where something is being referred to or named
Negative Logits
ooter       -0.79
ysc         -0.74
earances    -0.69
psons       -0.69
teasp       -0.69
olith       -0.66
\\\\\\\\    -0.66
lite        -0.66
avy         -0.65
eor         -0.65
Positive Logits
termed      0.76
pires       0.68
Fra         0.68
selves      0.66
called      0.64
calls       0.63
Fra         0.63
calling     0.62
'           0.62
('          0.61
Activations Density 0.036%