INDEX
Explanations
phrases related to locations or geopolitical entities
sentence-ending punctuation marks, specifically periods
Negative Logits
advoc      -0.92
challeng   -0.92
destro     -0.84
mosqu      -0.78
glim       -0.77
withd      -0.77
nodd       -0.76
compr      -0.74
onga       -0.73
predec     -0.72
Positive Logits
Besides      0.99
Whether      0.97
Especially   0.96
They         0.94
Its          0.94
While        0.93
Though       0.92
If           0.92
Whilst       0.90
Although     0.90
Activations Density: 0.393%