Explanations
proper names, possibly focusing on individual people
references to specific names and key figures in the context of personal narratives or stories
Negative Logits
Hamb     -0.79
Java     -0.76
Ó        -0.75
åĤ       -0.75
hamb     -0.75
hya      -0.74
caffe    -0.74
jiang    -0.74
ãĥĺ      -0.73
MLA      -0.73
Positive Logits
Ross       2.35
Ross       2.11
Ros        1.85
Ros        1.85
Roz        1.47
Rossi      1.45
ross       1.26
ROS        1.18
Rosenthal  1.16
Cos        1.12
Activations Density: 0.271%