Explanations
Agatha Christie and Eurovision
Negative Logits
ر    0.91
ور    0.86
р    0.72
vecb    0.69
chsler    0.67
ра    0.66
ш    0.66
㑓    0.66
طان    0.65
या    0.63
Positive Logits
↵↵    0.81
לי    0.78
}_    0.76
'    0.75
_    0.74
and    0.71
questa    0.69
ל    0.69
และ    0.69
ק    0.68
Activations Density 0.001%