INDEX
Explanations
proper nouns related to locations or individuals
instances of the word "Glad" and its variations
Negative Logits
lished        -0.70
indo          -0.70
defe          -0.66
aeda          -0.62
Instruments   -0.59
Polo          -0.58
repre         -0.58
dexter        -0.56
educating     -0.56
ELL           -0.55
Positive Logits
imir          1.14
ys            1.07
bach          1.00
iol           0.94
ewater        0.90
ysc           0.85
well          0.85
isl           0.84
der           0.83
Tid           0.80
Activations Density: 0.037%