INDEX
Explanations
proper nouns related to a specific name or concept ("ram")
references to the name "Eram."
Negative Logits
Yankee          -0.68
Carib           -0.65
predic          -0.61
indifference    -0.61
subp            -0.59
pitcher         -0.59
LCS             -0.57
urses           -0.57
Panthers        -0.55
bed             -0.54
Positive Logits
irez            1.18
ming            1.11
bling           1.03
matical         1.02
mers            1.02
pton            1.01
bler            1.01
atic            0.97
med             0.96
bles            0.92
Activations Density: 0.014%