Explanations
No Explanations Found
Negative Logits
Token      Logit
ertia      -0.18
keit       -0.17
oui        -0.16
ayne       -0.14
urname     -0.14
soever     -0.14
ayed       -0.14
-Isl       -0.14
-Muslim    -0.14
beit       -0.14
Positive Logits
Token      Logit
Rim        0.15
exter      0.14
whenever   0.14
angu       0.13
ëĦ·        0.13
pest       0.13
rozen      0.13
Nicol      0.13
LEM        0.13
ume        0.13
Activations Density: 0.000%
No Known Activations
This feature has no known activations.