Explanations
instances of specific numeric values and their potential implications
phrases related to sections or paragraphs
Negative Logits
neighb              -0.77
citiz               -0.77
aditional           -0.76
Þ                   -0.72
ageing              -0.70
nomine              -0.68
individuality       -0.68
footing             -0.67
proport             -0.67
newcom              -0.65
Positive Logits
³³³³³³³³            0.89
³³³                 0.89
³³³³³³³³³³³³³³³³    0.87
³³                  0.87
SPONSORED           0.84
Sounds              0.83
³³³³                0.81
????????            0.81
Ironically          0.80
Notice              0.80
Activations Density 0.329%