INDEX
Explanations
starts with "It", "You", or "company"
Negative Logits
pointer    0.47
برو    0.45
Лей    0.45
бюро    0.43
ాయి    0.42
spaw    0.42
らない    0.41
仅    0.41
አይ    0.39
Боли    0.39
Positive Logits
ang    0.47
ikon    0.46
akala    0.46
pues    0.45
وأضاف    0.45
inicio    0.44
ots    0.44
endaten    0.43
iz    0.43
ari    0.43
Activations Density 0.010%