INDEX
Explanations
mentions of specific names or terms across various contexts, including individuals, locations, and concepts
occurrences of names and references related to particular individuals
Negative Logits
elig        -0.94
bluff       -0.69
lished      -0.62
diploma     -0.60
entrants    -0.60
runway      -0.57
$$$$        -0.57
seism       -0.57
crawl       -0.56
RU          -0.56
Positive Logits
igans       0.96
felt        0.84
frames      0.76
oused       0.73
chenko      0.73
nian        0.73
velt        0.71
aku         0.71
zhen        0.68
idan        0.67
Activations Density 0.108%