INDEX
Explanations
phrases indicating a continuation or transition in speech or writing, often introducing new information or emphasizing a point
Negative Logits
Token     Logit
DISTR     -0.76
CPR       -0.59
Mobil     -0.59
pulp      -0.58
bluff     -0.58
shack     -0.58
Mirage    -0.57
RAD       -0.57
Monkey    -0.57
Monroe    -0.57
Positive Logits
Token     Logit
Pg        0.86
ttp       0.79
s         0.79
shall     0.77
aternity  0.77
ould      0.76
ept       0.75
¼         0.74
else      0.73
ı         0.72
Activations Density: 0.137%