INDEX
Explanations
the word "Our" when capitalized
references to specific numerical values and statistical data
Negative Logits
uti       -0.72
ila       -0.57
idel      -0.54
rots      -0.54
rence     -0.53
ills      -0.51
Joshua    -0.51
sqor      -0.50
wcs       -0.50
ola       -0.50
Positive Logits
-         1.29
-'        1.11
_>        1.06
âĢij       1.04
-)        0.98
ÙĴ        0.96
-(        0.95
-$        0.94
-|        0.94
-.        0.93
Activations Density 0.213%