Explanations
percentage numbers and units
Negative Logits
ט	0.50
ור	0.49
N	0.49
ى	0.46
ró	0.44
ూ	0.44
enaar	0.43
Temmuz	0.43
า	0.42
}$=	0.42
Positive Logits
is	0.56
contra	0.54
_	0.50
__	0.49
fission	0.49
ions	0.48
in	0.46
impi	0.46
adversarial	0.46
valuables	0.45
Activations Density 0.204%