Explanations
defensive terms and concepts
Negative Logits
z       0.66
顱      0.57
to      0.56
>\<^    0.54
eritud  0.52
burb    0.52
tica    0.52
da      0.51
acie    0.50
de      0.50
Positive Logits
ре       0.70
بر       0.65
ли       0.64
ला       0.63
ما       0.61
ור       0.60
ку       0.59
Players  0.59
ap       0.58
اس       0.57
Activations Density: 0.003%