INDEX
Explanations
names of universities or colleges
proper nouns, particularly names of people and locations
Negative Logits
raints      -0.76
د           -0.74
aneously    -0.69
squats      -0.67
"-          -0.67
ively       -0.64
ل           -0.64
heimer      -0.63
rikes       -0.62
DERR        -0.61
Positive Logits
stal        1.06
Bry         0.99
sta         0.91
llo         0.87
utral       0.86
cipled      0.84
giene       0.82
gments      0.81
sonian      0.81
cle         0.81
Activations Density 0.014%