Explanations
No Explanations Found
Negative Logits
ijn         -0.72
fine        -0.72
clair       -0.71
sterdam     -0.71
ventory     -0.70
luence      -0.70
cius        -0.70
lic         -0.68
license     -0.68
lication    -0.68
Positive Logits
Technician     0.74
technicians    0.69
grandparents   0.60
Corinth        0.60
nevertheless   0.60
Nurs           0.59
Patriarch      0.59
horm           0.59
ij士            0.58
holiest        0.57
Activations Density 0.000%
No Known Activations
This feature has no known activations.