INDEX
Explanations
phrases related to someone's past experiences or backgrounds, particularly focusing on various life roles or professions
the word "as" in a variety of contexts
Negative Logits
Token      Logit
ongyang    -0.74
aceae      -0.73
redits     -0.70
¨          -0.70
autions    -0.70
rous       -0.67
ombs       -0.67
jas        -0.65
raq        -0.65
gin        -0.64
Positive Logits
Token      Logit
pired      1.38
pires      1.27
well       0.90
part       0.90
ynchron    0.86
teenager   0.85
piring     0.84
mayor      0.81
opposed    0.80
deputy     0.80
Activations Density 0.130%
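
As a rough illustration of how a density figure like this could be read, the sketch below assumes that "activation density" means the fraction of token positions on which the feature fires (activation above zero). That definition, the function name activation_density, and the sample data are all assumptions made for illustration; the source does not state how the figure was computed.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of positions whose activation exceeds threshold.

    Assumed definition for illustration only; the dashboard's actual
    computation is not given in the source.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical data: 10,000 token positions, 13 of which are active.
sample = [0.0] * 9987 + [0.5] * 13
density = activation_density(sample)
print(f"{density:.3%}")  # prints 0.130%
```

Under this reading, a density of 0.130% would mean the feature is active on roughly 13 out of every 10,000 tokens.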