INDEX
Explanations
references to specific entities or important subjects in a discussion
Negative Logits
<bos>       -0.62
Agamemnon   -0.52
chagrin     -0.52
Phry        -0.50
Phoen       -0.49
Pharaoh     -0.49
nukes       -0.48
Tbilisi     -0.48
try         -0.48
condoms     -0.47
Positive Logits
this        1.21
these       1.21
THESE       1.13
These       1.12
THIS        1.11
These       1.10
THIS        1.10
particular  1.05
"):         1.04
deste       1.03
Activations Density: 1.337%