Explanations
phrases that introduce examples or hypothetical scenarios
Negative Logits
Token         Logit
Plenum        -0.63
مشين          -0.62
unica         -0.60
thea          -0.58
église        -0.57
castelo       -0.54
compétence    -0.53
''.           -0.53
Portanto      -0.52
Mui           -0.52
Positive Logits
Token            Logit
Например         0.89
bijvoorbeeld     0.84
Например         0.83
exempel          0.82
например         0.82
example          0.81
たとえば         0.81
Misalnya         0.79
třeba            0.76
beispielsweise   0.76
Activations Density 0.151%