INDEX
Explanations
phrases related to physical connections or relationships
the frequency of the word "to" in various contexts
Negative Logits
parency      -0.70
boycot       -0.69
calling      -0.66
sponsored    -0.65
hur          -0.64
announced    -0.63
rio          -0.62
comments     -0.62
vetting      -0.62
åij          -0.61
Positive Logits
ggles        1.12
othy         0.98
accommodate  0.96
minimize     0.95
maximize     0.92
compensate   0.91
conserve     0.90
ensure       0.87
shore        0.87
avoid        0.85
Activations Density: 0.341%