Explanations
No Explanations Found
Negative Logits
CT      -0.72
acea    -0.72
assin   -0.70
tein    -0.69
tz      -0.69
Citiz   -0.67
[/      -0.65
liest   -0.64
hens    -0.64
atics   -0.63
Positive Logits
barriers    0.66
depend      0.65
ceilings    0.62
products    0.62
Rite        0.61
Dir         0.60
classmate   0.59
Chambers    0.58
ooks        0.58
Mile        0.58
Activations Density: 0.000%
No Known Activations
This feature has no known activations.