Explanations
names and specific terms related to characters or entities within a narrative
Negative Logits
Token               Logit
士                  -0.77
¯¯¯¯                -0.77
ãģ®éŃĶ              -0.77
¯¯                  -0.72
¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯    -0.71
ndra                -0.69
faint               -0.69
ĪĴ                  -0.69
perature            -0.69
Else                -0.68
Positive Logits
Token    Logit
hod      1.00
rolet    0.96
opa      0.95
ozo      0.91
oton     0.89
aza      0.89
dos      0.88
uid      0.88
ados     0.85
uri      0.85
Activations Density: 0.006%