INDEX
Explanations
former roles and professions
Negative Logits
目前          0.58
т             0.49
currently     0.48
on            0.47
vengono       0.46
Currently     0.45
actuelle      0.45
attualmente   0.45
ам            0.45
后            0.45
Positive Logits
ehemaligen    0.71
býval         0.69
舊            0.64
former        0.61
formerly      0.61
autrefois     0.60
ehemalige     0.58
former        0.57
Former        0.56
Former        0.54
Activations Density: 0.025%