Explanations
No Explanations Found
Negative Logits
haar        -0.80
srf         -0.67
mathemat    -0.66
compr       -0.66
jri         -0.65
lamm        -0.65
relat       -0.64
ipel        -0.63
looting     -0.63
describ     -0.63

Positive Logits
Office      0.85
bird        0.78
services    0.76
ridges      0.73
ieu         0.70
office      0.69
mate        0.67
inder       0.67
doors       0.66
eon         0.65
Activations Density: 0.000%
No Known Activations
This feature has no known activations.