INDEX
Explanations
sentence starters
common introductory phrases or transition words in sentences
Negative Logits
alike     -0.67
":[       -0.62
ucket     -0.60
imposed   -0.59
itud      -0.58
wrapper   -0.57
Erie      -0.56
aukee     -0.56
¶         -0.55
ea        -0.55
Positive Logits
edes       0.80
resy       0.76
reditary   0.73
cknowled   0.68
letes      0.66
ructose    0.65
ixel       0.64
olon       0.64
Kap        0.61
Patton     0.60
Activations Density: 0.186%