INDEX
Explanations
occurrences of the word "taken"
instances of the word "taken" in various contexts
Negative Logits
ulo       -0.78
lich      -0.75
tor       -0.64
eers      -0.62
tions     -0.62
ler       -0.61
tion      -0.60
hips      -0.60
regress   -0.59
cape      -0.58
Positive Logits
aback      1.39
aways      1.04
advantage  0.90
care       0.88
cogn       0.86
イト       0.81
OVER       0.81
heed       0.80
away       0.79
INTO       0.77
Activations Density 0.034%