Explanations
phrases indicating examples or descriptions related to specific concepts
typical examples or possibilities
Negative Logits
&___          -0.44
ckså          -0.44
ostante       -0.43
gwaran        -0.39
Kontrola      -0.39
istnieje      -0.39
Cyfeiriadau   -0.38
zewnętrzne    -0.38
tershire      -0.38
uteen         -0.38
Positive Logits
typical       0.95
typical       0.83
Typical       0.81
Typical       0.79
typically     0.74
典型          0.69
typique       0.69
típico        0.68
typ           0.68
典型的        0.65
Activations Density 0.088%