INDEX
Explanations
names or terms related to individuals or organizations
mentions of specific names or entities, particularly related to people involved in notable events or discussions
Negative Logits
poses       -1.01
nces        -1.00
ample       -0.97
nsic        -0.92
posing      -0.87
sts         -0.81
cium        -0.80
velength    -0.79
terday      -0.79
pose        -0.78
Positive Logits
olitan      0.80
Stories     0.74
HAEL        0.74
osta        0.73
orter       0.73
urized      0.72
sburgh      0.72
helle       0.70
eas         0.70
hew         0.68
Activations Density 0.059%