INDEX
Explanations
locations or directions mentioned in text
occurrences of the word "from."
Negative Logits
Token          Logit
ratulations    -0.82
rongh          -0.74
faced          -0.73
mask           -0.72
leneck         -0.71
isphere        -0.70
few            -0.69
important      -0.67
ascript        -0.67
certain        -0.66
Positive Logits
Token          Logit
afar           1.35
whence         1.14
thence         0.99
abroad         0.90
scratch        0.90
inside         0.84
elsewhere      0.81
anywhere       0.79
within         0.78
somewhere      0.76
Activations Density: 0.235%