Explanations
No Explanations Found
Negative Logits
oster             -0.66
earnest           -0.64
ãĤ¹               -0.62
itious            -0.62
Neo               -0.61
..............    -0.60
lowly             -0.59
knit              -0.58
allel             -0.58
assic             -0.58
Positive Logits
BOOK              0.76
raf               0.73
confir            0.68
zees              0.66
eat               0.64
trial             0.64
Democr            0.61
ocr               0.61
Marginal          0.61
oÄŁ               0.61
Activations Density 0.000%
No Known Activations
This feature has no known activations.