INDEX
Explanations
prepositions followed by subsequent actions or detailed descriptions
the phrase "As for" followed by various topics or subjects
Negative Logits
eps       -0.92
ĸļ        -0.83
marine    -0.80
aukee     -0.75
CCC       -0.74
DEV       -0.73
DOC       -0.68
RFC       -0.66
cycle     -0.65
rift      -0.65
Positive Logits
bidden          0.84
example         0.81
fairness        0.79
clarification   0.77
wards           0.77
gery            0.77
instance        0.76
awhile          0.74
laz             0.73
disclaim        0.71
Activations Density: 0.030%