INDEX
Explanations
new information or reports
the presence of the word "new" in various contexts
Negative Logits
cius         -0.99
actionGroup  -0.78
Zip          -0.78
ashtra       -0.74
lua          -0.72
omever       -0.72
rice         -0.69
mop          -0.67
urations     -0.67
usercontent  -0.67
Positive Logits
bie           1.12
YORK          0.98
bies          0.96
arrivals      0.94
additions     0.91
revelations   0.90
developments  0.90
batch         0.88
generation    0.87
revelation    0.86
Activations Density 0.093%