Explanations
sentences discussing hypothetical situations or future actions
expressions of future possibilities and hypothetical scenarios
Negative Logits
Token       Logit
Fine        -0.63
artisan     -0.62
Tick        -0.62
ancer       -0.61
LTD         -0.59
Shooting    -0.58
Perfect     -0.57
Fancy       -0.57
Arcade      -0.56
Provided    -0.56
Positive Logits
Token       Logit
be          0.94
entail      0.84
avail       0.84
tolerate    0.80
feas        0.78
achieve     0.77
»Ĵ          0.77
survive     0.77
arrive      0.76
accomplish  0.76
Activations Density: 0.155%