INDEX
Explanations
prominent figures or individuals mentioned in text
references to individuals labeled as prominent figures in various contexts
Negative Logits
Token      Logit
Phones     -0.84
ramid      -0.79
ope        -0.78
agan       -0.77
ework      -0.75
thur       -0.74
yrinth     -0.73
©¶æ        -0.73
shall      -0.71
opes       -0.71
Positive Logits
Token          Logit
figures        0.81
personalities  0.80
role           0.77
finan          0.77
prominent      0.77
landmarks      0.76
commentator    0.74
contributor    0.73
bearer         0.72
anti           0.71
Activations Density: 0.039%