INDEX
Explanations
phrases or sentences that contain specific names, potentially related to articles, news reports, or discussions
references to a specific object or term likely associated with titles or names in a structured format
Negative Logits
Leilan      -0.76
igil        -0.75
arium       -0.72
enery       -0.71
Magikarp    -0.70
forward     -0.69
Sons        -0.69
forth       -0.69
burg        -0.66
ivities     -0.66
Positive Logits
JECT        1.32
BY          1.06
OB          1.02
ject        0.97
tained      0.96
rien        0.91
ooth        0.91
lique       0.91
ilib        0.90
acter       0.88
Activations Density 0.007%