INDEX
Explanations
names of places or locations
occurrences of the word "data."
Negative Logits
enegger    -0.72
paren      -0.70
tails      -0.68
taining    -0.64
convict    -0.64
sted       -0.63
Ö¼         -0.61
false      -0.59
starter    -0.59
premise    -0.59
Positive Logits
iba        0.96
eus        0.95
ña         0.94
ata        0.92
hedral     0.91
hea        0.88
illac      0.87
isy        0.87
ñ          0.86
ichi       0.83
Activations Density: 0.009%