INDEX
Explanations
No Explanations Found
Negative Logits
circa         0.68
nelle         0.66
or            0.63
nell          0.63
after         0.62
so            0.61
nel           0.61
tempo         0.61
conno         0.61
comparable    0.60
Positive Logits
ش             0.93
ور            0.73
نا            0.72
Puedes        0.72
وا            0.67
Puede         0.67
ن             0.66
ض             0.64
ها            0.64
أم            0.63
Activations Density 0.000%
No Known Activations
This feature has no known activations.