INDEX
Explanations
numerical data or references related to experimental results
citations in academic texts
Negative Logits
ImageContext        -0.49
localctx            -0.42
TemporalType        -0.41
IndentedString      -0.41
démocratique        -0.41
WebElementEntity    -0.39
fromLTRB            -0.39
democrá             -0.38
שוליים              -0.36
armar               -0.36
Positive Logits
ſta         0.62
myſelf      0.54
raiſ        0.53
himſelf     0.49
againſt     0.48
ſever       0.47
pleaſure    0.46
inſ         0.46
ticides     0.45
ſaid        0.45
Activations Density: 0.046%