INDEX
Explanations
proper nouns and names of people
mentions of individuals and their statements
Negative Logits
ģ«            -0.75
tune          -0.73
stolen        -0.66
ween          -0.62
CLS           -0.62
butterflies   -0.62
plete         -0.61
concurrent    -0.61
surfing       -0.60
trapping      -0.59
POSITIVE LOGITS
utenberg      0.81
presided      0.77
oyer          0.76
chaired       0.73
ynski         0.73
apologised    0.72
iman          0.72
thanked       0.71
urai          0.71
oversaw       0.71
Activations Density 0.218%