INDEX
Explanations
the word "Anyone" in the text
references to the word "anyone."
Negative Logits
Token       Logit
irth        -0.65
urations    -0.62
ories       -0.62
complete    -0.62
ÃŁ          -0.62
Labor       -0.61
heny        -0.61
rick        -0.59
Lawn        -0.59
Maze        -0.59
Positive Logits
Token       Logit
else        1.76
Else        1.19
else        1.15
Else        1.15
THING       0.99
20439       0.94
doubted     0.93
who         0.88
imaginable  0.86
omever      0.86
Activations Density 0.025%
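
To put the density figure in perspective, here is a minimal sketch that converts it into an expected count of active tokens. It assumes that "Activations Density" means the fraction of sampled tokens on which the feature fires at all; the page itself does not define the term. The function name and the one-million-token sample size are hypothetical, chosen only for illustration.

```python
def expected_active_tokens(density_percent: float, n_tokens: int) -> float:
    """Expected number of tokens on which the feature is active.

    Assumes density_percent is the share of tokens (in percent) with a
    nonzero activation; this reading is an assumption, not stated on the page.
    """
    return density_percent / 100.0 * n_tokens


density_percent = 0.025  # value reported above

# Hypothetical sample of one million tokens.
print(expected_active_tokens(density_percent, 1_000_000))

# Equivalent "one token in N" reading of the same figure.
print(round(100.0 / density_percent))
```

Under that assumption, 0.025% works out to about 250 active tokens per million, or roughly one token in 4,000, which fits a feature tied to a single word such as "anyone".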