Explanations
proper names of people
mentions of specific individuals, particularly those named Nicholas and Christina
Negative Logits
Token      Logit
stall      -1.09
marked     -0.94
alling     -0.91
views      -0.90
ebook      -0.88
rior       -0.85
aby        -0.83
gress      -0.83
front      -0.81
ishing     -0.81
Positive Logits
Token      Logit
ource      0.79
Hernandez  0.79
aurus      0.76
Briggs     0.74
terday     0.74
hift       0.73
Celest     0.73
Isa        0.72
Gustav     0.72
Bender     0.72
Activations Density: 0.035%