INDEX
Explanations
phrases related to performing actions one at a time
phrases that indicate frequency and timing
Negative Logits
drawn       -0.85
leased      -0.76
utters      -0.74
roth        -0.74
erest       -0.73
çͰ          -0.71
redit       -0.70
ships       -0.67
wr          -0.66
vik         -0.65
Positive Logits
istg        0.72
secut       0.71
perimeter   0.68
rotation    0.68
ratio       0.68
gallon      0.65
rectangle   0.63
average     0.62
metric      0.62
sandwich    0.62
Activations Density: 0.156%