Explanations
No Explanations Found
Negative Logits
Token      Logit
Ta         -0.70
porous     -0.68
Hatch      -0.68
Bake       -0.67
parting    -0.66
liking     -0.65
rapport    -0.63
Bang       -0.63
Spect      -0.63
Passage    -0.62
Positive Logits
Token      Logit
bis        0.86
ibrary     0.83
APD        0.82
afort      0.77
glers      0.75
UV         0.74
calling    0.74
MIC        0.73
Dex        0.72
wd         0.72
Activations Density: 0.000%
No Known Activations
This feature has no known activations.