INDEX
Explanations
names of people or characters mentioned in the text
proper nouns, specifically names of people and organizations
Negative Logits
ashtra    -0.76
Pod       -0.75
Þ         -0.72
ãĥŁ       -0.69
PDATE     -0.68
VALUE     -0.67
··        -0.67
âĶĢâĶĢ    -0.67
TING      -0.67
DATA      -0.66
Positive Logits
isner     0.93
monds     0.88
verson    0.87
gins      0.78
iffin     0.78
sworth    0.77
ridge     0.74
McC       0.73
ahl       0.73
beard     0.72
Activation Density: 0.237%
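
The density figure is commonly read as the fraction of sampled token positions on which the feature fires. The sketch below illustrates that reading only; the function name `activation_density`, the zero threshold, and the toy sample are assumptions for illustration, not the dashboard's actual computation or data.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of positions whose activation exceeds `threshold`.

    Assumption: "density" means the share of sampled tokens on which the
    feature is active; the source does not state the exact definition.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical toy data: 237 active positions out of 100,000 sampled tokens.
sample = [1.5] * 237 + [0.0] * (100_000 - 237)
print(f"{activation_density(sample):.3%}")
```

Under this reading, a density of 0.237% means the feature is active on roughly 2 to 3 of every 1,000 tokens, which fits a feature that responds narrowly to surname-like tokens.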