Explanations
numeric values or expressions in the text
Negative Logits
er        -1.12
(blank)   -0.72
erl       -0.69
spar      -0.69
spar      -0.69
قار       -0.67
Feld      -0.65
irot      -0.64
anair     -0.64
Rana      -0.64
Positive Logits
7         1.95
Seventh   1.34
SEVEN     1.34
SEVEN     1.26
seventh   1.22
Seventh   1.21
Seven     1.21
Siete     1.20
seventh   1.17
VII       1.16
Activations Density 0.672%
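
As a rough illustration of the density figure above, the sketch below assumes that activation density means the fraction of token positions on which the feature's activation is above zero. The page itself does not define the term, so that definition, the function name `activation_density`, the `threshold` parameter, and the sample values are all hypothetical.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of positions whose activation exceeds `threshold`.

    Assumption: "density" is the share of token positions where the
    feature fires at all. This is not confirmed by the page.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical data, not taken from the page: 1000 positions, 7 of them active.
sample = [0.0] * 993 + [1.5] * 7
density = activation_density(sample)
print(f"{density:.3%}")  # formatted as a percentage, like the figure above
```

Under this reading, a density of 0.672% would mean the feature fires on roughly 7 of every 1,000 token positions.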