Explanations
names or terms related to specific places or individuals
mentions of specific names or locations
Negative Logits
Token       Logit
oured       -0.77
Apostles    -0.76
mount       -0.73
天          -0.70
gemony      -0.69
ently       -0.67
AGE         -0.63
hett        -0.61
ouring      -0.61
Aval        -0.60
Positive Logits
Token       Logit
imura       0.87
gnu         0.87
sth         0.80
isites      0.78
emort       0.77
thread      0.76
jiang       0.75
ikuman      0.75
zers        0.75
san         0.75
Activations Density: 0.016%