INDEX
Explanations
entities mentioned alongside a specific numerical value
phrases that indicate authorship or attribution in historical contexts
Negative Logits
vc          -0.80
Netflix     -0.75
igun        -0.75
ploy        -0.74
verage      -0.72
internet    -0.71
ahime       -0.71
wcs         -0.69
register    -0.68
TPP         -0.68
Positive Logits
Johann      1.38
Augustus    1.35
Herod       1.33
Raphael     1.28
Sir         1.26
Napoleon    1.25
Rabbi       1.25
Moses       1.25
Franz       1.24
Ibn         1.23
Activations Density: 0.370%