Explanations
No Explanations Found
Negative Logits
Token       Logit
eal         -0.84
olean       -0.76
ocative     -0.76
Ibid        -0.74
usive       -0.72
earch       -0.69
amy         -0.67
emin        -0.67
evaluate    -0.66
atisf       -0.65
Positive Logits
Token       Logit
now         1.90
now         1.21
Now         1.12
NOW         1.11
Now         1.07
still       0.73
currently   0.72
NOW         0.72
already     0.70
presently   0.68
Activations Density: 0.000%
No Known Activations
This feature has no known activations.