INDEX
Explanations
the word "anyone" with a relatively high activation value
the word "anyone"
the word "anyone" and its variations throughout the text
Negative Logits
Token        Logit
pa           -0.65
ories        -0.63
ritz         -0.61
Kitchen      -0.61
Congo        -0.60
Labor        -0.60
Shows        -0.59
BDS          -0.59
Maze         -0.59
neck         -0.59
Positive Logits
Token        Logit
else         1.55
THING        1.26
Else         1.14
Else         1.03
else         0.98
soever       0.97
imaginable   0.88
20439        0.87
omever       0.87
doubted      0.86
Activations Density 0.017%