INDEX
Explanations
phrases indicating physical movement towards a certain direction
instances of the word "over"
Negative Logits
Token            Logit
Lauder           -0.70
ーティ           -0.68
illary           -0.65
ORY              -0.65
onna             -0.65
understatement   -0.65
iation           -0.64
nesota           -0.60
ivity            -0.60
oko              -0.59
Positive Logits
Token            Logit
drive            1.10
loading          0.95
hang             0.93
tones            0.93
kill             0.90
lord             0.89
rule             0.85
clock            0.81
sold             0.79
priced           0.78
Activations Density: 0.066%